In bulk quantum computation one can manipulate a large number of indistinguishable quantum
computers by parallel unitary operations and measure expectation values of certain observables
with limited sensitivity. The initial state of each computer in the ensemble is known but not pure.
Methods for obtaining effective pure input states by a series of manipulations have been described
by Gershenfeld and Chuang [1,2] (logical labeling) and Cory et al. [3,4] (spatial averaging) for the
case of quantum computation with nuclear magnetic resonance. We give a different technique called
temporal averaging. This method is based on classical randomization, requires no ancilla qubits
and can be implemented in nuclear magnetic resonance without using gradient fields. We introduce
several temporal averaging algorithms suitable for both high temperature and low temperature bulk
quantum computing and analyze the signal to noise behavior of each.
I. INTRODUCTION
Quantum computation involves the transformation of one known pure quantum state into another unknown state,
which can be measured to provide a computationally useful output. Traditionally, it has been understood that an
important part of this process is proper preparation of a fiducial initial pure state, such that the computational input
is well known, and the output is thus meaningful. In particular, it has usually been assumed that the input cannot
be a stochastic mixture. However, two groups [1-4] have recently shown that by using a different technique, called
bulk quantum computation, the same computation can be performed but with an initial mixture state, which is often
much easier to achieve experimentally. Bulk quantum computation is being implemented for small numbers of qubits
using nuclear magnetic resonance (NMR) techniques.

Bulk quantum computation is performed on a large ensemble of indistinguishable quantum computers. At the
beginning of a computation, each member $c$ of the ensemble is in an initial state $\rho_{c,0}$ such that the average
$\rho_0 = \mathrm{Exp}(\rho_{c,0})$ of these states is known. A bulk computation with such an ensemble can be divided into three steps
consisting of preparation, computation and readout. Each of these steps is equivalent to an application of the same
quantum operation to each member of the ensemble. The purpose of the preparation step is to transform the input
state to an effective pure state which permits an unbiased observation of the output of the algorithm. The computation
is assumed to be a fixed unitary operator derived from a standard quantum algorithm, that is, an algorithm with a
one qubit answer. We wish to determine this answer on input $|0\rangle$ (the state where every qubit is $|0\rangle$). The readout
procedure may include some postprocessing of the algorithm's output and terminates in the measurement of the
observable $\sigma_z^{(1)}$, the spin along the z-axis of the first qubit. In bulk quantum computation, the measurement yields a
noisy version of the average value of $\sigma_z^{(1)}$ over the ensemble of quantum computers. For our signal to noise analyses,
we assume that the noise is unbiased with variance $s^2$.

Formally, a bulk quantum computation of an algorithm implementing the unitary transformation $C$ with preparation
and postprocessing operations $\mathcal{P}$ and $\mathcal{R}$ transforms $\rho_0$ to

$$\rho_{\mathrm{out}} = \sum_{i,j} R_i C P_j \rho_0 P_j^\dagger C^\dagger R_i^\dagger, \tag{1}$$

where the $R_i$ and $P_j$ are the operators in a linear representation of the quantum operations $\mathcal{P}$ and $\mathcal{R}$ ||. The
measurement step of the readout procedure yields $\mathrm{tr}(\rho_{\mathrm{out}}\sigma_z^{(1)})$ with noise. In the methods investigated in this paper,
$\mathcal{R}$ is unitary, usually the identity. The purpose of $\mathcal{P}$ is to create an effective pure state. The simplest example of an
effective pure state (Cory et al. [3] call this a pseudo-pure state) is a density matrix of the form

$$\sum_j P_j \rho_0 P_j^\dagger = p\,|0\rangle\langle 0| + \frac{1-p}{N} I. \tag{2}$$

Here $N = \dim(I) = 2^n$, where $n$ is the number of qubits. If $\mathcal{R} = I$, then
$\rho_{\mathrm{out}} = p\,C|0\rangle\langle 0|C^\dagger + \frac{1-p}{N} I$, so that

$$\mathrm{tr}(\rho_{\mathrm{out}}\sigma_z^{(1)}) = p\,\mathrm{tr}(C|0\rangle\langle 0|C^\dagger\sigma_z^{(1)}). \tag{3}$$



If the excess probability $p$ of the ground state $|0\rangle$ is larger than the smallest detectable signal, we are able to determine
whether the output of a standard algorithm is 0 or 1 by learning whether the measurement yields a negative or a
positive value. To achieve sufficient confidence in the answer or to learn more about the average answer, the bulk
computation is repeated several times. Confidence $c$ in the answer of a standard algorithm at a signal to noise ratio
of SNR per experiment requires $\sim \log(1/c)/\mathrm{SNR}^2$ experiments.
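As a rough illustration of this scaling, the sketch below evaluates $\log(1/c)/\mathrm{SNR}^2$ for given values. The function name `experiments_needed`, the proportionality constant of one, and the reading of $c$ as a tolerated failure probability are assumptions made for illustration; the text only states the order of magnitude.

```python
import math


def experiments_needed(snr: float, c: float) -> int:
    """Order-of-magnitude repetition count log(1/c) / SNR^2 (at least 1).

    Assumption: c is read as the tolerated probability of a wrong answer,
    and the unknown proportionality constant is set to one.
    """
    if snr <= 0.0 or not (0.0 < c < 1.0):
        raise ValueError("need snr > 0 and 0 < c < 1")
    return max(1, math.ceil(math.log(1.0 / c) / snr ** 2))
```

Halving the signal to noise ratio roughly quadruples the count, which is the main practical message of the estimate.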

Prior to the present work, there were two approaches to implementing an effective pure state preparation procedure.
These approaches may be classified as spatial averaging and logical labeling. Spatial averaging was introduced and
implemented by Cory et al. [3,4]. In general, spatial averaging involves partitioning the ensemble of quantum computers
into a number of subensembles and applying a different unitary operator to each of them. Given enough subensembles
and proper choices of unitary operators, the average density matrix over the whole ensemble can be transformed into an
effective pure state. This procedure requires methods for distinguishing between quantum computers in the ensemble.
In NMR this can be accomplished by using well-known gradient pulse methods to address individual cells in a bulk
sample. The cells in the implementation of Cory et al. are two dimensional slices of constant magnetic fields defined
by a transient gradient. The logical labeling technique of Gershenfeld and Chuang [1,2] is fundamentally different; it
avoids the use of explicit subensembles by exploiting ancillary qubits as labels. An initial unitary transformation is
applied which redistributes the states in such a way that conditional on the state of the labels, an effective pure state
is obtained in the qubits to be used for computation. Gershenfeld and Chuang demonstrated that this can be done
efficiently in the high temperature limit for non-interacting qubits, where $\rho_0$ can be expressed as a small deviation
from $\frac{1}{N} I$.

Here, we consider a new and different technique: Temporal averaging. Rather than attempting to guarantee an 
effective pure state in a single experiment, this method uses several experiments with different preparation steps chosen 
either systematically or randomly. The measurements from each experiment are averaged to give the final answer. 
The preparation steps are chosen such that the average of the prepared input states is an effective pure state. The 
advantages of this method are that no ancillary qubits are needed, it can be implemented to work at any temperature 
and it is not necessary to distinguish subensembles of quantum computers. In the high temperature regime it can 
be implemented efficiently without any loss of signal, and in general, the signal to noise ratios are sufficiently well 
behaved to permit efficient determination of the desired answer to any given level of confidence. 

We will describe several temporal averaging methods and discuss their properties. Temporal averaging methods 
can be loosely categorized into high temperature and low temperature methods. The high temperature methods 
tend to be simpler and are the most efficient for NMR quantum computations involving small numbers of qubits. 
Three such methods will be described: Exhaustive averaging, labeled flip&swap and randomized flip&swap. Labeled 
flip&swap uses a limited form of logical labeling to obtain the desired answer in two experiments with only one ancilla, 
while randomized flip&swap needs no ancillas but may require additional experiments to overcome noise from the 
randomization procedure. Flip&swap methods rely on an inversion symmetry of high temperature thermal states of 
non-interacting particles. Low temperature methods do not require special assumptions on the initial state, but tend
to use more operations to implement. Two such methods are of interest, randomization over a group and averaging by
entanglement. The first depends on which unitary group is used. We will show that there are groups which yield good
signal to noise behavior and which can be implemented in cubic time. Averaging by entanglement has the advantage
of requiring fewer experiments, but necessitates discarding some of the qubits. This method may be useful if some of 
the qubits are discarded anyway for the purpose of polarization enhancement by computational cooling, a family of 
techniques for statically or dynamically increasing polarization of the ground state for a subset of the available qubits. 

The different temporal averaging methods are introduced and analyzed in the following sections. We begin with a 
simple example borrowed from NMR, discuss exhaustive averaging and the flip&swap methods, show how randomized 
averaging over a group can be used and give the method based on entanglement. More detailed descriptions of the 
algorithms and the mathematical analyses are in the appendix. It is assumed that the reader is familiar with the 
basic concepts of quantum computation and nuclear magnetic resonance [|| .
II. NMR EXAMPLE
To illustrate the ideas on which temporal averaging is based, consider a two qubit example from room temperature
NMR with liquids. The density matrix of an AX system consisting of a proton and a carbon-13 nucleus in a 400 MHz
spectrometer is approximately given by



(4) 



How to calculate these input states will be discussed below. Because all relevant observables are traceless, we focus
our attention on the second matrix, the deviation density matrix. Suppose our goal is to perform some computation
$C$ on the ground state $|00\rangle\langle 00|$ and then to observe $\sigma_z$ on the proton. For this observation, the states
$|01\rangle\langle 01|$, $|10\rangle\langle 10|$ and $|11\rangle\langle 11|$ constitute noise. To remove this noise we can exploit the fact that the
computation and the observation are linear in the input. We perform three experiments, each with a different
preparation step which permutes the undesirable input states, and then average the output. The first experiment uses
the unmodified input, corresponding to preparation with $P = I$. The second permutes
$|01\rangle\langle 01| \to |10\rangle\langle 10| \to |11\rangle\langle 11| \to |01\rangle\langle 01|$ using the unitary transformation (written in the basis ordering
$|00\rangle, |01\rangle, |10\rangle, |11\rangle$)

$$P_1 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}. \tag{5}$$



This results in the input state

$$\rho_1 = P_1 \rho P_1^\dagger. \tag{6}$$

The third preparation applies the inverse permutation $P_2 = P_1^\dagger$ to produce the input state

$$\rho_2 = P_2 \rho P_2^\dagger. \tag{7}$$

The average of the input density matrices is then given by

$$\bar\rho = \frac{1}{3}\sum_{i=0}^{2} P_i \rho P_i^\dagger, \tag{8}$$

where $P_0 = I$ is the trivial preparation of the first experiment.

The average of the measurements of $\sigma_z^{(1)}$ after a computation gives
$\mathrm{tr}(C\bar\rho C^\dagger\sigma_z^{(1)}) = 1.333\cdot 10^{-5}\,\mathrm{tr}(C|00\rangle\langle 00|C^\dagger\sigma_z^{(1)})$.
It can be seen that the contributions to the measurements of the undesirable input states have been eliminated. In
NMR, $\sigma_z^{(1)}$ is measured by applying a radiofrequency pulse to rotate the magnetization of the target spin into the
transverse (x-y) plane and observing the free induction decay as discussed in [B .
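The three-experiment average can be checked with exact arithmetic. The sketch below tracks only the diagonal of the deviation density matrix in units of $10^{-5}$. The populations in `deviation` are assumed, illustrative values (they are not taken from the text), chosen so that the resulting factor agrees with the 1.333 quoted above; `permute_populations` is a hypothetical helper name.

```python
from fractions import Fraction

# Assumed deviation populations (units of 1e-5) for |00>, |01>, |10>, |11>.
# Illustrative values only; the text's own matrix is not reproduced here.
deviation = [Fraction(1), Fraction(3, 5), Fraction(-3, 5), Fraction(-1)]


def permute_populations(diag, image):
    """Diagonal of P rho P^dagger for the permutation P|i> = |image[i]>."""
    out = [Fraction(0)] * len(diag)
    for i, p in enumerate(diag):
        out[image[i]] = p
    return out


identity = [0, 1, 2, 3]   # first experiment: no preparation
cycle = [0, 2, 3, 1]      # |01> -> |10> -> |11> -> |01>
inverse = [0, 3, 1, 2]    # inverse of the cycle

prepared = [permute_populations(deviation, img) for img in (identity, cycle, inverse)]
average = [sum(column) / len(prepared) for column in zip(*prepared)]

# Decompose the average as a*|00><00| + b*I; only a contributes to a traceless observable.
b = average[1]
a = average[0] - b
```

With these assumed inputs the non-ground entries all become $-1/3$ and the ground-state coefficient `a` equals $4/3$.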
III. EXHAUSTIVE AVERAGING

The example of the previous section is an instance of exhaustive averaging. For $n$ qubits, it involves cyclically
permuting the non-ground states in $2^n - 1$ different ways such that the average of the prepared states is given by
$(\rho_{00} - \bar p)|0\rangle\langle 0| + \bar p I$, where $\bar p$ is the average probability of the non-ground states. This method works for any
initial state which is diagonal in the computational basis states.
Although the number of experiments required grows exponentially, it is reasonable to consider implementing it for
small numbers of qubits.

To design the quantum network for the preparation steps, one can exploit the structure of the Galois field GF($2^n$).
If the non-ground initial states are labeled by elements of GF($2^n$), multiplication by a non-zero element $x$ of this field
implements one of the cyclic permutations. Since multiplication can be implemented with a reasonable (quadratic)
number of controlled nots, each such $x$ yields a preparation operator $P_x$. Details are given in the appendix. The seven
networks needed to exhaustively average three qubits are in Figure 1.
FIG. 1. Networks required for state preparation when implementing exhaustive averaging for three qubits using
controlled-nots and swaps. The networks shown perform the six non-identity cyclic permutations. Seven experiments are
performed, one with no special preparation and six with the preparation networks above. ⊕ symbols denote the target qubits of
the controlled-not gates, and • symbols denote the control.
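The field-multiplication construction can be tried out for three qubits. The sketch below is an illustration under stated assumptions: it represents GF($2^3$) with the reduction polynomial $x^3 + x + 1$ (one possible choice, not necessarily the one behind the figure), works on classical populations rather than quantum networks, and uses made-up input populations.

```python
from fractions import Fraction


def gf_mul(a: int, b: int, n: int = 3, poly: int = 0b1011) -> int:
    """Multiply a and b in GF(2^n); poly is the reduction polynomial (default x^3 + x + 1)."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # add (XOR) the current shifted copy of a
        b >>= 1
        a <<= 1
        if a >> n:               # degree reached n: reduce modulo poly
            a ^= poly
    return result


def exhaustive_average(populations, n=3, poly=0b1011):
    """Average the populations over multiplication by every non-zero field element."""
    size = 1 << n
    total = [Fraction(0)] * size
    for x in range(1, size):
        for state in range(size):
            total[gf_mul(x, state, n, poly)] += populations[state]
    return [value / (size - 1) for value in total]


# Made-up diagonal initial state (sums to one), ground state first.
populations = [Fraction(k, 100) for k in (30, 5, 10, 15, 5, 20, 5, 10)]
averaged = exhaustive_average(populations)
```

Each multiplication fixes state 0 and permutes states 1 to 7, so the ground-state population is untouched while every non-ground state ends up with the same average population.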



The signal to noise ratio of exhaustive averaging is determined by the sensitivity of each measurement, the excess
probability in the ground state and the number of experiments being performed. If the initial density matrix is
$\rho = \sum_i \rho_{ii}|i\rangle\langle i|$ with $0 \le i \le 2^n - 1$, then the average density matrix over all experiments is given by
$\bar\rho = (\rho_{00} - \bar p)|0\rangle\langle 0| + \bar p I$, where $\bar p = \frac{1}{2^n - 1}\sum_{i=1}^{2^n - 1}\rho_{ii}$. If the computation's output is
$x = \mathrm{tr}(C|0\rangle\langle 0|C^\dagger\sigma_z^{(1)})$, then the observed average signal is $(\rho_{00} - \bar p)x$. Given that the variance of the noise in
each measurement is $s^2$, the standard deviation of the noise in the average is $s/\sqrt{2^n - 1}$, which gives an overall
signal to noise ratio of $\sqrt{2^n - 1}\,(\rho_{00} - \bar p)x/s$. Typically, the density matrix will describe a high temperature,
polarized system of non-interacting spins, in which case $\rho_{00} - \bar p \approx n\delta/2^n$ (more precisely $n\delta/(2^n - 1)$ to first
order in $\delta$, which is why the bound below is an inequality), where $\delta$ is the single spin polarization (see Section IV).
It is also convenient to define $\mathrm{SNR}_1 = \delta/s$ as the signal to noise ratio from a single spin measurement, such that we
may express the signal to noise ratio of exhaustive averaging as

$$\mathrm{SNR} \ge \frac{n}{2^n}\sqrt{2^n - 1}\,|x|\,\mathrm{SNR}_1. \tag{9}$$

This argument assumes no bias in the individual measurements. To ensure that exhaustive averaging works correctly
for standard quantum algorithms, the bias must be small compared to $(\rho_{00} - \bar p)/2^n$.
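A short numerical check of this estimate is sketched below. It assumes identical spins with the linearised high temperature populations $(1/N)(1 + \sum_i (-1)^{b_i}\delta)$ discussed in Section IV; the function names and the sample values of $n$, $\delta$ and $s$ are illustrative choices.

```python
import math
from itertools import product


def thermal_populations(n, delta):
    """Linearised populations (1/N)(1 + sum_i (-1)^{b_i} delta), ground state first."""
    N = 2 ** n
    return [(1 + delta * sum(1 - 2 * bit for bit in bits)) / N
            for bits in product((0, 1), repeat=n)]


def exhaustive_snr(n, delta, s, x=1.0):
    """sqrt(2^n - 1) (rho_00 - p_bar) |x| / s for the linearised thermal state."""
    pops = thermal_populations(n, delta)
    N = len(pops)
    p_bar = sum(pops[1:]) / (N - 1)
    return math.sqrt(N - 1) * (pops[0] - p_bar) * abs(x) / s


def snr_lower_bound(n, snr1, x=1.0):
    """Right-hand side of the bound: (n / 2^n) sqrt(2^n - 1) |x| SNR_1."""
    return n / 2 ** n * math.sqrt(2 ** n - 1) * abs(x) * snr1
```

For example, with `n = 3`, `delta = 1e-5` and `s = 1e-8` (so that $\mathrm{SNR}_1 = 10^3$), `exhaustive_snr` lies slightly above `snr_lower_bound`, as expected from $n\delta/(2^n - 1) \ge n\delta/2^n$.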
IV. FLIP&SWAP

Flip&Swap is a method which exploits special properties of the high temperature thermal state for non-interacting
particles to create an effective pure state with few experiments. If the internal Hamiltonian of a collection of $n$ qubits
is given by $\mathcal{H}$, then the thermal state is given by

$$\rho = \frac{1}{Z} e^{-\beta\mathcal{H}}, \tag{10}$$

where $\beta = 1/k_BT$ is the usual Boltzmann factor and $1/Z$ is the partition function normalization factor. At high
temperatures, a good approximation is to take

$$\rho \approx \frac{1}{N}(I - \beta\mathcal{H}), \tag{11}$$

where $N = 2^n$ and we have defined energies so that $\mathrm{tr}\,\mathcal{H} = 0$.

Consider the case where the Hamiltonian for the qubits is that of non-interacting distinguishable particles with
energy eigenstates $|0\rangle$ and $|1\rangle$ and energies $-\epsilon_i$ and $+\epsilon_i$, respectively, for the $i$'th qubit. This is a good approximation
for many spin systems in NMR, provided that the coupling constants are small compared to the chemical shift
differences between the different spins. In this case, the energy eigenstates are close to the standard computational
basis states and the energy shifts due to coupling are small compared to the Larmor frequencies. The probability of
the state $|b\rangle$ for bit string $b = b_0 b_1 \cdots b_{n-1}$ is given by

$$\rho_{bb} = \prod_{i=0}^{n-1} \frac{1}{2}\left(1 + (-1)^{b_i}\delta_i\right) \tag{12}$$

with

$$\delta_i = \frac{e^{\beta\epsilon_i} - e^{-\beta\epsilon_i}}{e^{\beta\epsilon_i} + e^{-\beta\epsilon_i}}. \tag{13}$$

To first order, $\delta_i \approx \beta\epsilon_i$ is the polarization of the $i$'th qubit. Thus we can write

$$\rho_{bb} \approx \frac{1}{N}\left(1 + \sum_{i=0}^{n-1} (-1)^{b_i}\delta_i\right), \tag{14}$$

where this first order approximation is valid as long as $\delta_t = \sum_{i=0}^{n-1}\delta_i \ll 1$.
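To get a feel for when the first order form is adequate, the following sketch compares the product form of the populations with its linearisation. The function names and the sample polarizations are assumptions chosen for illustration.

```python
from itertools import product


def exact_population(bits, deltas):
    """Product form: prod_i (1/2)(1 + (-1)^{b_i} delta_i)."""
    p = 1.0
    for b, d in zip(bits, deltas):
        p *= 0.5 * (1 + (-1) ** b * d)
    return p


def linear_population(bits, deltas):
    """First order form: (1/N)(1 + sum_i (-1)^{b_i} delta_i)."""
    N = 2 ** len(deltas)
    return (1 + sum((-1) ** b * d for b, d in zip(bits, deltas))) / N


def max_relative_error(deltas):
    """Largest relative deviation of the linear form over all bit strings."""
    worst = 0.0
    for bits in product((0, 1), repeat=len(deltas)):
        exact = exact_population(bits, deltas)
        worst = max(worst, abs(exact - linear_population(bits, deltas)) / exact)
    return worst
```

For room temperature sized polarizations such as `[1e-5] * 3` the relative error is of second order in the polarization, whereas for `[0.3] * 3` the linear form is visibly off.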

Given the linear approximation to $\rho_{bb}$, it can be seen that if $\bar b = (1 - b_0)(1 - b_1)\ldots(1 - b_{n-1})$ is obtained from
$b$ by flipping each bit, then $\rho_{\bar b\bar b} + \rho_{bb} = 2/N$. Thus to obtain an unbiased, uniform input from two experiments, it
suffices to perform one experiment with no preparation step and one with all the qubits flipped in the preparation
step, averaging the results. However, this eliminates effective polarization in the ground state as well as all the other
states.

To retain the ground state polarization we can perform two experiments. In the first, the thermal input state is
used without modification by applying preparation operator $P_0 = I$. In the second, the preparation $P_1$ consists of
first inverting each qubit by applying $\sigma_x$ and then swapping the ground state $|0\rangle$ with the state $|\mathbf{1}\rangle$ (all qubits in state
$|1\rangle$). The average of the two prepared states is given by

$$\rho_s = \frac{1}{N}\left(I + \delta_t\left(|0\rangle\langle 0| - |\mathbf{1}\rangle\langle\mathbf{1}|\right)\right). \tag{15}$$
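The two-experiment average can be verified on populations with exact fractions. This is a minimal sketch that assumes the linearised thermal populations and treats the preparation classically, as a relabeling of diagonal entries; the function names and the sample polarizations are illustrative.

```python
from fractions import Fraction
from itertools import product


def linear_thermal(deltas):
    """Populations (1/N)(1 + sum_i (-1)^{b_i} delta_i), keyed by bit tuple."""
    n = len(deltas)
    N = 2 ** n
    return {bits: (1 + sum((-1) ** b * d for b, d in zip(bits, deltas))) / N
            for bits in product((0, 1), repeat=n)}


def flip_and_swap(pops):
    """Flip every bit, then exchange the all-zeros and all-ones populations."""
    n = len(next(iter(pops)))
    zero, ones = (0,) * n, (1,) * n
    flipped = {tuple(1 - b for b in bits): p for bits, p in pops.items()}
    flipped[zero], flipped[ones] = flipped[ones], flipped[zero]
    return flipped


def two_experiment_average(deltas):
    """Average of the unmodified state and the flip&swap-prepared state."""
    first = linear_thermal(deltas)
    second = flip_and_swap(first)
    return {bits: (first[bits] + second[bits]) / 2 for bits in first}
```

The result has population $(1 + \delta_t)/N$ on all-zeros, $(1 - \delta_t)/N$ on all-ones and exactly $1/N$ elsewhere, even when the individual polarizations differ.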

There are two methods for eliminating the remaining polarization in $|\mathbf{1}\rangle$. The first, randomized flip&swap, uses
randomization to average this polarization over all non-ground states. The second, labeled flip&swap, uses one of the
qubits as a logical label, following the method of [1,2].

The simplest randomization method involves first selecting a random non-ground state $|b\rangle$ and applying a unitary
operation $R$ which maps $|\mathbf{1}\rangle \to |b\rangle$ and leaves the ground state unchanged. Both preparation steps are modified
by adding this unitary operation after the flip&swap and before the computation. To improve the signal to noise
ratio, the whole procedure can be repeated several times. $R$ can be implemented efficiently using at most $n - 1$
controlled-nots.
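One way to realise such an $R$ is sketched below. This is an illustrative construction, not necessarily the one intended in the text: pick a qubit where $b$ has a 1 as the control and apply a controlled-not onto every qubit where $b$ has a 0. The helper names are hypothetical, and the gates act here on classical bit strings, which suffices because controlled-nots permute the computational basis.

```python
import random


def random_nonground_state(n, rng=random):
    """Uniformly random non-zero bit tuple of length n."""
    value = rng.randrange(1, 2 ** n)
    return tuple((value >> (n - 1 - i)) & 1 for i in range(n))


def cnot_network(target_state):
    """(control, target) pairs mapping all-ones to target_state and fixing all-zeros."""
    if not any(target_state):
        raise ValueError("target_state must be a non-ground state")
    control = target_state.index(1)
    return [(control, t) for t, bit in enumerate(target_state) if bit == 0]


def apply_cnots(bits, gates):
    """Apply the controlled-nots, in order, to a computational basis state."""
    bits = list(bits)
    for control, target in gates:
        bits[target] ^= bits[control]
    return tuple(bits)
```

Because the control qubit is never a target, it stays 1 on the all-ones input and clears exactly the positions where $b$ is 0; on the all-zeros input no gate fires. The gate count is the number of zeros in $b$, hence at most $n - 1$.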
The signal to noise ratio for a randomized flip&swap now depends not only on the initial polarization of the ground
state, the computation and the sensitivity of the measurements, but also on the contribution to the variance from the
random choice of $R$. The detailed calculations of this variance will be given in the appendix. If all the polarizations
$\delta_i$ are the same, $\delta_i = \delta$, we define $\mathrm{SNR}_1 = \delta/s$ (the signal to noise ratio for a measurement of $\sigma_z^{(1)}$ on the thermal state),
and a lower bound on the signal to noise ratio is given by

$$\mathrm{SNR} \ge \frac{n\,|x|\,\mathrm{SNR}_1}{2^n\sqrt{1/2 + n^2\,\mathrm{SNR}_1^2/(2^n(2^n - 2))}}. \tag{16}$$

Graphs of the behavior of the signal to noise ratio of this and the other methods are given in Figure 2. For small $n$,
the limited number of possible random choices results in a significant reduction in the signal to noise ratio. However,
a reasonable number of repetitions of the experiment can still reliably determine any bias in $x$, if $x$ is not too small.
An improvement in the SNR can also be obtained by using randomization over the normalizer group as discussed in
Section V. This is called fully randomized flip&swap.



FIG. 2. Graphs of lower bounds on the signal to noise ratio for the different averaging methods for two or more identical
non-interacting qubits at high temperature, and $|x| = 1$. The bounds hold for a one qubit signal to noise ratio of $10^3$. The
signal to noise ratios are for one experiment in the case of randomization over a group, two in the case of flip&swap and $2^n - 1$
in the case of exhaustive averaging. The noise is due both to experimental sensitivity and contributions from randomization
(except for labeled flip&swap and exhaustive averaging, which involve no randomization). Repeating the experiments $k$ times
with independent random choices increases the signal to noise ratios by a factor of $k^{1/2}$.

Labeled flip&swap requires $n + 1$ qubits and applies the flip&swap operation to all of them. Instead of removing the
polarization in $|\mathbf{1}\rangle$ by averaging, it is exploited by using the $n+1$'th qubit as a label similar to the methods introduced
in P]. This method was discovered independently by D. Leung. Conditionally on the $n+1$'th qubit being in state
$|0\rangle$, the first $n$ qubits are in an effective pure state with excess probability in $|0\rangle$. Conditionally on the $n+1$'th qubit
being in state $|1\rangle$, the first $n$ qubits are in an effective pure state, but with a deficiency in $|\mathbf{1}\rangle$. Both experiments'
preparation steps must be followed by an operation which conditionally on the $n+1$'th qubit flips all the other qubits
to turn the conditional deficiency in $|\mathbf{1}\rangle$ into one in $|0\rangle$. After the computation is complete, the deficiency can be
turned into an effective excess by conditionally reversing the sign of the answer. The full network for $n = 3$ is given
in Figure 3.
FIG. 3. Quantum networks for the two experiments to implement labeled flip&swap for three computational qubits. The
readout operation on qubit A is shown explicitly as a triangle. Filled circles denote conditioning on $|1\rangle$, while unfilled circles
denote conditioning on $|0\rangle$.



The signal to noise ratio for labeled flip&swap is given by

$$\mathrm{SNR} = \frac{\sqrt{2}\,(n + 1)}{N}\,\mathrm{SNR}_1. \tag{17}$$



V. RANDOMIZATION OVER GROUPS 



Exhaustive averaging is useful for small numbers of qubits and flip&swap works for nearly non-interacting qubits 
at high temperatures. If the number of qubits and the polarization satisfy $n\delta \sim 1$ or if the initial state does not have
approximate inversion symmetry, it is necessary to consider other methods which are both reasonably efficient and 
can be applied to arbitrary initial states. Randomization based on groups of unitary operators has this property. 

In general, randomization involves choosing a preparation operator $P$ according to a predetermined probability
distribution. To ensure that the expected value of the measurement represents the output of the computation on an
effective pure state, we require that $\mathrm{Exp}_P(P\rho P^\dagger) = \bar\rho$ is an effective pure state. The methods to be discussed satisfy
that

$$\bar\rho = (\rho_{00} - \bar p)|0\rangle\langle 0| + \bar p I, \tag{18}$$

with $\bar p = \frac{1}{N-1}\sum_{i\ge 1}\rho_{ii}$. It is desirable that the initial state $\rho$ has excess probability in the ground state. If possible, the
true initial state should be transformed by a unitary transformation which guarantees that the maximum probability
state is the ground state, and that the density matrix is diagonal in the computational basis. (For nearly uniform
mixtures of states and high sensitivity, it may be more efficient to have a sufficiently large deficiency in the ground
state.)

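Let $\sigma = C^\dagger\sigma_z^{(1)}C$, so that $x = \mathrm{tr}(|0\rangle\langle 0|\sigma) = \sigma_{00}$. A single experiment with randomized preparation yields the
measurement $r(P) = \mathrm{tr}(P\rho P^\dagger\sigma)$ with noise of variance $s^2$; the expectation of $r(P)$ is given by $\bar r = (\rho_{00} - \bar p)x$. The signal to
noise ratio for a single run of the computation is determined by comparing $\bar r^2$ to the variance $v$ of $r(P)$ together with the
measurement variance $s^2$. Thus, the signal to noise ratio is

$$\mathrm{SNR}(P, C, \rho) = \frac{|\bar r|}{\sqrt{s^2 + v}}. \tag{19}$$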

If, for example, we wish to learn the expectation $\bar r$ to within $\bar r(1 \pm \epsilon)$, the number of experiments required to
achieve confidence $c$ is proportional to $\log(1/c)/(\epsilon^2\,\mathrm{SNR}(P, C, \rho)^2)$ in the Gaussian regime. Due to the large number
of choices in the randomization it is reasonable to expect that this regime applies even for one experiment. If this
were not the case, the average would need to be inferred by techniques robust against outliers. If we are only
interested in learning the sign of $\bar r$ with confidence $c$, this can be done with $\sim \log(1/c)/\mathrm{SNR}(P, C, \rho)^2$ experiments,
regardless of the actual distribution. One method is to use the sign of the median of the $k_1$ averages of the results
from $k_2 = \max(1, 4/\mathrm{SNR}(P, C, \rho)^2)$ independent experiments. Because the probability of the event that the average
of $k_2$ experiments has the wrong sign is bounded by $1/4$ (by Chebyshev's inequality), the probability of failure is $\le e^{-O(k_1)}$. The constant in the
exponent can be obtained from the Chernoff-Hoeffding bounds (ï(]] for the probability of having heads in more than half
of $k_1$ flips of a biased coin with the probability of heads given by $1/4$.
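The sign-estimation procedure can be sketched as follows. The function name `estimate_sign`, the rounding of $k_2$ up to an integer, and the Gaussian test distribution are assumptions made for illustration; the argument in the text does not depend on the distribution.

```python
import math
import random
import statistics


def estimate_sign(sample, snr, k1):
    """Sign of the median of k1 averages, each over k2 = max(1, ceil(4 / snr^2)) draws.

    sample() returns one noisy measurement; snr is the single-experiment
    signal to noise ratio |mean| / standard deviation.
    """
    k2 = max(1, math.ceil(4.0 / snr ** 2))
    averages = [sum(sample() for _ in range(k2)) / k2 for _ in range(k1)]
    return 1 if statistics.median(averages) > 0 else -1


# Illustrative use: mean +/-0.5, unit standard deviation, so snr = 0.5 and k2 = 16.
rng = random.Random(1)
positive = estimate_sign(lambda: rng.gauss(0.5, 1.0), snr=0.5, k1=31)
negative = estimate_sign(lambda: rng.gauss(-0.5, 1.0), snr=0.5, k1=31)
```

Taking the median rather than the grand mean is what makes the estimate robust: a few badly wrong averages cannot move the median across zero.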
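To compute the variance of $r(P)$, define

$$\tilde\rho = \rho - \mathrm{Exp}_P\,P\rho P^\dagger = \rho - \bar p I - (\rho_{00} - \bar p)|0\rangle\langle 0|. \tag{20}$$

Then

$$\begin{aligned} v &= \mathrm{Exp}_P\,\mathrm{tr}(P\tilde\rho P^\dagger\sigma)^2 \\ &= \mathrm{Exp}_P\,\mathrm{tr}\big((P\tilde\rho P^\dagger \otimes P\tilde\rho P^\dagger)(\sigma\otimes\sigma)\big) \\ &= \mathrm{tr}\big(\mathrm{Exp}_P(P\tilde\rho P^\dagger \otimes P\tilde\rho P^\dagger)(\sigma\otimes\sigma)\big). \end{aligned} \tag{21}$$

Thus to ensure that $\bar r$ is as desired and to compute $v$, we first verify that $\mathrm{Exp}_P(P\tilde\rho P^\dagger) = 0$ and then compute
$\mathrm{Exp}_P(P\tilde\rho P^\dagger \otimes P\tilde\rho P^\dagger)$.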

In the algorithms described below, $P$ is a random product of operators, each chosen uniformly from various groups
of unitary operators. The desired expectations can often be computed in closed form if $P$ is a random element of a
unitary group $G$. For this purpose, it is convenient to use the representations of $G$ defined by $\pi_1(P)(A) = PAP^\dagger$ and
$\pi_2(P)(A \otimes B) = PAP^\dagger \otimes PBP^\dagger$, where $\pi_2(P)$ is linearly extended to all four-tensors. Both $\pi_1$ and $\pi_2$ are unitary
representations of $G$ for the usual inner product of operators and four-tensors: $\langle A, B\rangle = \mathrm{tr}(A^\dagger B)$ and
$\langle A\otimes B, C\otimes D\rangle = \mathrm{tr}(A^\dagger C)\,\mathrm{tr}(B^\dagger D)$, with the latter inner product extended to all four-tensors by linearity in the
second argument and conjugate linearity in the first. Using this representation, for $P$
sampled uniformly from $G$, it follows that the expectations can be obtained by projecting $\rho$ and $\rho\otimes\rho$ onto the trivial
eigenspaces of $\pi_1$ and $\pi_2$. Specifically, let $\Pi_1$ and $\Pi_2$ be the projection superoperators onto the space of all $A$ such
that $\pi_1(P)A = A$ for all $P \in G$ and onto the space of all $B$ such that $\pi_2(P)B = B$ for all $P \in G$, respectively. Then
$\mathrm{Exp}_{P\in G}\,\pi_1(P)A = \Pi_1 A$ and $\mathrm{Exp}_{P\in G}\,\pi_2(P)B = \Pi_2 B$. We use this to calculate variances resulting from averaging over four groups, below.



A. Diagonal Groups 

If the initial density matrix is not diagonal and it is not feasible to perform the unitary transformation which
makes it diagonal in the computational basis, one can use randomization over a diagonal group to reduce the effect
of the off-diagonal entries. Let $\mathcal{D}$ be a group of diagonal operators $S_f : |j\rangle \to i^{f(j)}|j\rangle$ with $f(j) \in \{0, 1, 2, 3\}$. To
ensure sufficiently small trivial eigenspaces for the representations $\pi_1$ and $\pi_2$, we require that the following phase
independence condition holds: If $f(j) - f(k) + f(l) - f(m) = 0 \bmod 4$ for all $f$, then $j = k$ and $l = m$ or $j = m$
and $k = l$. We call a group with this property a diagonal group. Randomization over $\mathcal{D}$ is accomplished by choosing
a member of $\mathcal{D}$ uniformly and applying it to the initial state. Although the expectation of the randomized density
matrix is not yet an effective pure state, it does reduce the off-diagonal contributions to the expectation and the
variance. For example,

$$\mathrm{Exp}_{P\in\mathcal{D}}\,P\rho P^\dagger = \sum_{i=0}^{N-1}\rho_{ii}|i\rangle\langle i|. \tag{22}$$

To obtain an effective pure state, additional randomization steps are required. The expectations needed for computing
variances are calculated in the appendix. An efficiently implementable diagonal group $\mathcal{D}$ can be obtained as a subgroup
of the normalizer group introduced below.

B. Two-transitive Permutation Groups 

Let $\mathcal{T}$ be a two-transitive group of permutations acting on the set of states $|1\rangle, \ldots, |N - 1\rangle$. By definition, for every
$i \ne j$ and $k \ne l$, there is a permutation $\pi \in \mathcal{T}$ such that $\pi(i) = k$ and $\pi(j) = l$. Then

$$\mathrm{Exp}_{P_1\in\mathcal{D},\,P_2\in\mathcal{T}}\,P_2P_1\rho P_1^\dagger P_2^\dagger = (\rho_{00} - \bar p)|0\rangle\langle 0| + \bar p I, \tag{23}$$

which is the desired effective pure state. An effective pure state would be obtained on average even with a one-transitive
group, such as the cyclic permutations used for exhaustive averaging. However, the variance for one-transitive groups
can be quite large and two-transitivity helps in reducing it.

To give the upper bound on $v$ for randomization with $\mathcal{D}$ and $\mathcal{T}$, define

$$\rho_d = \sum_{i\ge 1}\tilde\rho_{ii}|i\rangle\langle i|. \tag{24}$$

Then 

v < tv(p 2 d ) + ^l^tr(p 2 ). (25) 

The derivation of this inequality is in the appendix. In the high temperature regime, this implies a signal to noise
ratio of at least

$$\mathrm{SNR} \ge \frac{n\,|x|\,\mathrm{SNR}_1}{2^n\sqrt{1 + n\,\mathrm{SNR}_1^2/(2^n - 2)}}. \tag{26}$$

Efficiently implementable two-transitive permutation groups can be obtained from the normalizer group.

C. The Normalizer Group 

The normalizer group $\mathcal{N}$, more specifically, the normalizer of the error group, consists of all unitary operations $U$
which satisfy that for any tensor product of Pauli operators $\sigma$, $U\sigma U^\dagger$ is also a tensor product of Pauli operators (up
to a phase factor). If the Pauli operators are labeled by $\sigma_{00} = I$, $\sigma_{01} = \sigma_z$, $\sigma_{10} = \sigma_x$, $\sigma_{11} = \sigma_y$ and, for example,
$\sigma_{101101} = \sigma_{10}\otimes\sigma_{11}\otimes\sigma_{01}$, then the elements of the normalizer group are characterized by
$U\sigma_bU^\dagger = (-1)^{\langle x, b\rangle}\,i^{f(b, L)}\,\sigma_{Lb}$,
where $x$ is an arbitrary bit vector, $\langle x, b\rangle$ denotes the inner product modulo 2 of bit vectors and $L$ is an arbitrary
invertible (modulo 2) 0-1 matrix which satisfies $L^TML = M$, where $Mb$ is the bit vector obtained from $b$ by swapping
adjacent bits belonging to the same factor. The exponent $f(b, L)$ depends only on $b$ and $L$; its values are not needed
for the present analyses. The group of matrices $L$ with this property acts transitively on non-zero bit vectors. The
normalizer group yields several subgroups useful for randomization.

Linear Phase Shifts. 

The group $\mathcal{D}$ generated by controlled-sign flips and the operator $S = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}$ acting on any qubit consists of
diagonal operators with action $|k\rangle \to i^{\langle x, k\rangle}(-1)^{\langle k, Bk\rangle}|k\rangle$, where $x$ is a vector with entries in $\{0, 1, 2, 3\}$ and $B$ is an
arbitrary $n$ by $n$ 0-1 matrix. To check that the phase independence condition (Section V A) holds, suppose that for
all $x$ and $B = yz^T$ (here $k$, $l$, $m$ and $n$ denote 0-1 vectors labeling basis states),

$$x^T(k - l + m - n) + 2(k^Tyz^Tk - l^Tyz^Tl + m^Tyz^Tm - n^Tyz^Tn) = 0 \bmod 4. \tag{27}$$

(28) 
This implies that $k - l + m - n = 0 \bmod 4$. If $k = m$, then $l = n = k$, since $k$, $l$, $m$ and $n$ are all 0-1 vectors. If
not, without loss of generality, suppose that $k \ne 0$. To derive a contradiction, suppose also that $k \ne l$ and $k \ne n$.
If $k$ is not in the span (modulo two) of $l$, $m$ and $n$, then there exists $z$ orthogonal (modulo two) to $l$, $m$ and $n$, but
not $k$, which contradicts the equality above. Thus $k$ must be in the span of $l$, $m$ and $n$. If $k$ is not in the span of
two of them, say $l$ and $m$, then there exists $z$ orthogonal to $l$ and $m$ but not $k$ and a $y$ orthogonal to $n$ but not $k$.
Again, we find that the equality cannot hold. Thus it must be the case that $k = l + m \bmod 2$, $k = l + n \bmod 2$ and
$k = m + n \bmod 2$. This implies that $m = n = l$ and $k = 0$. Thus the desired independence condition holds.
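For very small systems the phase independence condition can also be confirmed by brute force. The sketch below enumerates every phase function of the form $f(k) = \langle x, k\rangle + 2\,k^TBk \bmod 4$ with arbitrary 0-1 matrices $B$ (a superset of the rank-one choices $B = yz^T$ used in the argument) and checks the condition for all quadruples of basis states. The function names are hypothetical and the check is only practical for one or two qubits.

```python
from itertools import product


def phase_exponent(x, B, k):
    """Exponent of i picked up by basis state k: <x,k> + 2 k^T B k (mod 4)."""
    n = len(k)
    linear = sum(xi * ki for xi, ki in zip(x, k))
    quadratic = sum(k[a] * B[a][b] * k[b] for a in range(n) for b in range(n))
    return (linear + 2 * quadratic) % 4


def phase_independent(n):
    """True if f(j) - f(k) + f(l) - f(m) = 0 mod 4 for all f forces a trivial pairing."""
    states = list(product((0, 1), repeat=n))
    xs = list(product(range(4), repeat=n))
    Bs = [[flat[r * n:(r + 1) * n] for r in range(n)]
          for flat in product((0, 1), repeat=n * n)]
    tables = [{k: phase_exponent(x, B, k) for k in states} for x in xs for B in Bs]
    for j, k, l, m in product(states, repeat=4):
        always_zero = all((f[j] - f[k] + f[l] - f[m]) % 4 == 0 for f in tables)
        trivial = (j == k and l == m) or (j == m and k == l)
        if always_zero and not trivial:
            return False
    return True
```

Calling `phase_independent(2)` examines 256 phase functions against 256 quadruples and finds no non-trivial quadruple on which all phases cancel.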

Linear Cyclic Permutations. 

A group $\mathcal{S}$ acting cyclically on the set $|1\rangle, \ldots, |N - 1\rangle$ is obtained by representing the field GF($2^n$) as a vector space over
GF(2) with elements represented by bit strings of length $n$ in some basis. Multiplication by non-zero elements of
GF($2^n$) defines a cyclic subgroup of C of order $2^n - 1$.

Linear Permutations. 

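The group $\mathcal{T}$ of linear permutations is generated by the controlled-not operations. The group consists of the unitary
$U$'s which satisfy $U|b\rangle = |Lb\rangle$, where $L$ is an invertible (modulo two) 0-1 matrix. The group acts two-transitively on
the set $|1\rangle, \ldots, |N - 1\rangle$, because any two distinct non-zero bit vectors are linearly independent modulo two.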
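Two-transitivity of the linear permutations can be checked directly for a few qubits. The following sketch enumerates all invertible 0-1 matrices by brute force and tests whether one ordered pair of distinct non-zero bit vectors can be mapped to every other such pair. The function names are illustrative and the enumeration is only sensible for small $n$.

```python
from itertools import product


def mat_vec(L, v):
    """Matrix-vector product modulo two."""
    return tuple(sum(a * b for a, b in zip(row, v)) % 2 for row in L)


def invertible_matrices(n):
    """Yield every n x n 0-1 matrix that is invertible modulo two."""
    vectors = list(product((0, 1), repeat=n))
    for flat in product((0, 1), repeat=n * n):
        L = [flat[r * n:(r + 1) * n] for r in range(n)]
        if len({mat_vec(L, v) for v in vectors}) == len(vectors):  # bijective
            yield L


def is_two_transitive(n):
    """True if the linear permutations act two-transitively on non-zero bit vectors."""
    nonzero = [v for v in product((0, 1), repeat=n) if any(v)]
    if len(nonzero) < 2:
        return True
    i0, j0 = nonzero[0], nonzero[1]
    orbit = {(mat_vec(L, i0), mat_vec(L, j0)) for L in invertible_matrices(n)}
    return len(orbit) == len(nonzero) * (len(nonzero) - 1)
```

For three qubits there are 168 invertible matrices, and the orbit of a single ordered pair covers all 42 ordered pairs of distinct non-zero vectors.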



D. Conditional Normalizer Group 

Randomization over the normalizer group is as effective for variance reduction as is randomization over the unitary
group. The main difficulty is that the normalizer group does not fix $|0\rangle$. This can be remedied by alternating
randomization with $\mathcal{T}$ and with the conditional normalizer group $\mathcal{N}_1$ which acts on the first $n - 1$ qubits given that
the last one is in state $|1\rangle$.

The first step in the procedure is to randomize with $\mathcal{D}$ (if needed) and $\mathcal{T}$. Each following step involves randomizing
with $\mathcal{N}_1$ and then with $\mathcal{T}$. The total number of steps determines how effective the randomization is. The procedure is
designed such that the expectation of the resulting density matrix is the desired effective pure state after every step.
The variance $v_{k+1}$ after the $k$'th step can be estimated by (see the appendix):

w fc+ i < A fc ^tr(^) + ^p 2 , (29) 

with A = e 1 ^ /2. In the high temperature regime this implies a signal to noise ratio of 

$$\mathrm{SNR} \ge \frac{n\,|x|\,\mathrm{SNR}_1}{N\sqrt{1 + 2n\,\mathrm{SNR}_1^2/(2^n(2^n - 1))}}, \tag{30}$$

where $k$ was chosen such that $A^k < 1/2(2^n + 2)$.



VI. EFFECTIVE PURE STATES BY ENTANGLEMENT 



The temporal randomization methods discussed above are useful when the device is qubit limited, in the sense that
it is difficult to access additional qubits. It is important to realize that ancillary qubits involved only in preparation
and postprocessing generally do not need to have long decoherence or relaxation times. For example, if they are used
only in the preparation phase, their quantum coherence does not need to be maintained in computation or readout. If
such ancillas are available, they can and should be used to simplify the effective pure state preparation. Interestingly,
if an additional n qubits are available, it is possible to prepare a nearly perfect effective pure state for any diagonal
initial state by exploiting entanglement.

Here is an explicit algorithm which results in an effective pure state on the first n qubits given 2n qubits. The basic
idea is to map the computational basis states other than the ground state on the first n qubits to nearly maximally
entangled states. Write a computational basis state on the 2n qubits as $|a\rangle|b\rangle$, where a and b are length n bit vectors.
Let x be a generator of the multiplicative group of non-zero elements of GF($2^n$). The desired unitary transformation
is the composition of the maps

$$P_1 : |a\rangle|b\rangle \to \frac{1}{\sqrt{2^n}}\sum_c (-1)^{\langle b,c\rangle}|a\rangle|c\rangle, \tag{31}$$

$$P_2 : |a\rangle|b\rangle \to |a x^b\rangle|b\rangle, \tag{32}$$

where b is interpreted as a bit vector in the first exponent and as a binary number in the second. Consider the reduced
density matrix $\rho_{ab}$ on the first n qubits derived from the state $P_2P_1|a\rangle|b\rangle$. If $a \neq 0$,

$$\rho_{ab} = \frac{1}{N}\left(I - |0\rangle\langle 0| + |a\rangle\langle a|\right), \tag{33}$$
$$\rho_{0b} = |0\rangle\langle 0|. \tag{34}$$

This is nearly an effective pure state. If $\rho$ is the reduced initial density matrix on the first n qubits and $\rho$ is diagonal,
then after applying $P_2P_1$, the reduced density matrix is

$$\frac{N-1}{N}\left((\rho_{00}-\bar p)|0\rangle\langle 0| + \bar p I\right) + \frac{1}{N}\rho. \tag{35}$$

The deviation from the effective pure state is sufficiently small to be of no concern in most cases.
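
The reduced density matrices of Eqs. (33) and (34) can be checked by brute force for a small register. The sketch below is only an illustration under stated assumptions: it takes n = 3, represents GF(8) with the polynomial $x^3 + x + 1$, and uses the element x as generator; none of these choices are taken from the text, and all function names are hypothetical.

```python
from fractions import Fraction

N_BITS = 3                 # n qubits per register (illustrative choice)
N = 1 << N_BITS            # N = 2^n
MODULUS = 0b1011           # x^3 + x + 1, an irreducible polynomial for GF(8) (assumed)
GENERATOR = 0b010          # the element x, which generates the non-zero elements of GF(8)

def gf_mul(u, v):
    """Multiply two elements of GF(2^3), each stored as a 3-bit integer."""
    result = 0
    while v:
        if v & 1:
            result ^= u
        v >>= 1
        u <<= 1
        if u & N:          # degree reached n: reduce modulo the field polynomial
            u ^= MODULUS
    return result

def gf_pow(base, exponent):
    """Raise a field element to a non-negative integer power."""
    result = 1
    for _ in range(exponent):
        result = gf_mul(result, base)
    return result

def reduced_density(a, b):
    """Reduced density matrix on the first register of P2 P1 |a>|b>."""
    # Amplitudes up to the common factor 2^(-n/2): the sign of |a x^c>|c>.
    amplitudes = {}
    for c in range(N):
        sign = -1 if bin(b & c).count("1") % 2 else 1   # (-1)^<b,c>
        amplitudes[(gf_mul(a, gf_pow(GENERATOR, c)), c)] = sign
    rho = [[Fraction(0)] * N for _ in range(N)]
    for (a1, c1), s1 in amplitudes.items():
        for (a2, c2), s2 in amplitudes.items():
            if c1 == c2:   # partial trace over the second register
                rho[a1][a2] += Fraction(s1 * s2, N)
    return rho

def expected(a):
    """Right-hand side of Eqs. (33) and (34)."""
    rho = [[Fraction(0)] * N for _ in range(N)]
    if a == 0:
        rho[0][0] = Fraction(1)
        return rho
    for i in range(1, N):
        rho[i][i] = Fraction(1, N)
    rho[a][a] += Fraction(1, N)
    return rho

print(all(reduced_density(a, b) == expected(a) for a in range(N) for b in range(N)))
```

The doubled weight on $|a\rangle$ arises because the exponent c runs over N values while x has order N - 1, so $a x^0$ and $a x^{N-1}$ coincide.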

Entanglement can be exploited even if fewer than n additional bits are available. In fact, essentially the same
algorithm works. However, the deviation from an effective pure state becomes larger and residual bias must be
removed by another technique such as randomization. In general, if ancillary qubits are available, the effectiveness of
averaging methods can be improved. For example, we can randomize the states $|a\rangle|b\rangle$ with $a \neq 0$ with the subgroup
$C_m$ of the group of linear permutations which preserves the subspace $\{|0\rangle|b\rangle\}$. If this does not reduce the variance
enough, a version of the conditional normalizer randomization method can be used, where $C_m$ is used instead of the
full group of linear permutations.

Ancillary qubits are likely to be available whenever a computational cooling method is used to increase the proba- 
bility of the ground state in some of the available qubits. Computational cooling uses ancillas and in-place operations 
to transfer heat from the computational qubits to the ancillas. The simplest such methods are based on decoding a 
classical error-correcting code in-place and exploiting the fact that the thermal state is equivalent to a noisy ground 
state. 



VII. EXPERIMENTAL EVIDENCE 



The temporal randomization methods can find immediate application in NMR quantum computation, even with
simple molecules, as we demonstrate with the following experimental results utilizing exhaustive averaging to extract
an effective pure state from a two spin system.

Using a model two spin system, we prepared an effective state similar to that of Eq.(||) from a thermal state. This
was done by implementing the quantum circuits shown in Fig. 4 to perform the permutation of Eq.rïq) and its inverse.



FIG. 4. Quantum circuit implementation of the permutations $P_1$ and $P_2$.

The two-spin physical system used in these experiments was carbon-13 labeled chloroform (Fig. 5) supplied by
Cambridge Isotope Laboratories, Inc. (catalog no. CLM-262), and used without further purification. A 200 millimolar
sample was prepared with d6-acetone as a solvent, degassed, and flame sealed in a standard 5 mm NMR sample tube,
at the U.C. Berkeley College of Chemistry.
FIG. 5. Molecule of chloroform: the two active spins in this system are the $^{13}$C and the $^{1}$H.

Spectra were taken using Bruker AMX-400 (U.C. Berkeley) and DRX-500 (Los Alamos) spectrometers using standard
probes. The resonance frequencies of the two proton lines (in the DRX-500) were measured to be at 500.133921
MHz and 500.134136 MHz, and the carbon lines were at 125.767534 MHz and 125.767749 MHz, with errors of ±1
Hz. The radiofrequency (RF) excitation carrier (and probe) frequencies were set at the midpoints of these peaks, so
that the chemical shift evolution could be suppressed, leaving only the 215 Hz J-coupling between the two spins. The
$T_1$ and $T_2$ relaxation times were measured using standard inversion-recovery and Carr-Purcell-Meiboom-Gill pulse
sequences. For the proton, it was found that $T_1 \approx 7$ sec and $T_2 \approx 2$ sec, and for carbon, $T_1 \approx 16$ sec and $T_2 \approx 0.2$ sec.
The short carbon $T_2$ time is due to coupling with the three quadrupolar chlorine nuclei, which shortens the coherence
time. Nevertheless, these time scales were all much longer than those of the operations applied, guaranteeing that we
could implement quantum transforms and observe quantum dynamics.

We performed quantum state tomography to systematically obtain the final quantum state; this procedure will be
described in detail elsewhere [11]. In each tomography procedure, nine experiments were performed, applying different
pulses to measure all the possible elements in the density matrix in a robust manner. The resulting deviation density
matrix for the thermal state is shown in Fig. 7A. As expected, all the off-diagonal elements are nearly zero, while the
diagonal elements follow a pattern of a + b, a - b, -a + b, and -a - b. An error of about 5% was observed in the data,
due primarily to imperfect calibration of the 90° pulse widths and inhomogeneity of the magnetic field.



FIG. 6. NMR pulse program implementations of the permutations $P_1$ and $P_2$. Each RF pulse was about 10 microseconds
long, and the time between the pulses was about 2.3 milliseconds.

The two permutation quantum circuits were implemented using the pulse programs shown in Fig. 6. Because of the
absence of phase correction steps in the controlled-not gates [2], the actual transforms implemented were not exactly
those of Eq.(5), but rather,



$$P_1 = \ldots \tag{36}$$

$$P_2 = \ldots \tag{37}$$



For the purposes of temporal randomization of an initially diagonal density matrix, the phases of the transformations
can be ignored. We obtained the density matrices shown in Fig. 7B-C from these two transformations. The effective
pure state we obtained was approximately



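$$\begin{pmatrix} 194 & \epsilon & \epsilon & \epsilon \\ \epsilon & 24 & \epsilon & \epsilon \\ \epsilon & \epsilon & \epsilon & \epsilon \\ \epsilon & \epsilon & \epsilon & 8 \end{pmatrix} - 57 I, \tag{38}$$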



where $|\epsilon| < 5.4$. An error of ±5 was calculated, based on analysis of the linewidth integration, least squares fitting
used in the tomography procedure, and standard error propagation. This result compares favorably with the result
expected from Eq.(|J). Further work has been done to use this state as an input into a non-trivial computation; that
work demonstrates the creation and manipulation of effective pure states which are in superpositions, and will be
reported elsewhere [11].





FIG. 7. Experimentally measured deviation density matrices for (A) thermal state, (B) state after $P_1$ operation, (C) state
after $P_2$ operation. (D) Effective pure state (biased sum of the three). Real components only are shown; all imaginary
components are small.
VIII. CONCLUSION



We have described new techniques for creating effective pure states which complement the logical labeling and spatial
averaging techniques previously discovered. Our temporal averaging methods are unique in their use of summation
over experiments carried out at different times and powerful by virtue of averaging over transformations chosen
systematically (in the case of labeled flip&swap) or randomly (for randomization over a transformation group).
The choice of temporal averaging method in an experiment depends on the number of qubits available, how many
are required for computation, the initial density matrix and the desired signal to noise ratio. A summary of our
recommendations based on the analyses in this manuscript follows: For small numbers of qubits exhaustive averaging
can be used for any initial density matrix which is diagonal in the computational basis. If the initial state is close to
that of non-interacting particles at high temperature, the flip&swap techniques can be used. If a non-computational
qubit is available, then labeled flip&swap is the simplest and most efficiently implemented method. Asymptotically
it requires a linear number of quantum operations, and unless high signal to noise ratios are needed, involves many
fewer experiments than exhaustive averaging. In terms of quantum operations, exhaustive averaging appears to be
more efficient up to at least four qubits. The actual minimum number of qubits for which labeled flip&swap uses
fewer quantum operations per experiment than exhaustive averaging depends on the implementation and remains to
be determined. If every qubit is required for computation, then randomized flip&swap can be used at a cost of more
quantum operations per experiment. For large numbers of qubits where the high temperature regime or the
non-interacting assumption fails, randomization over a group can be used. If ancillary qubits are available, randomization
can be combined with entanglement. It remains to be seen whether this situation will be encountered in practice.

Future theoretical work will investigate combinations of logical, spatial, and temporal labeling techniques, and
establish a connection between these procedures and error-correction. Experiments will also be performed to demonstrate
the different techniques with large molecules and to explore their relative merits in practice.
APPENDIX A: CALCULATIONS OF VARIANCE FOR RANDOMIZATION OVER GROUPS



The expectation and variance of the outcome of an experiment using randomization over a group G can be
determined from the trivial eigenspaces of the representations $\pi_1 : \pi_1(U)(A) = UAU^\dagger$ and $\pi_2 : \pi_2(U)(A \otimes B) =
UAU^\dagger \otimes UBU^\dagger$. In the next sections, these eigenspaces are determined and the resulting variances estimated. We
begin with some calculations for the diagonal groups.



1. Diagonal Groups 



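Let $\mathcal{D}$ be a diagonal group as defined in Section V A. This group is used to diagonalize the average density matrix
before randomizing with more powerful groups. We compute the projections onto the trivial eigenspaces of both
representations $\pi_1$ and $\pi_2$:

$$\mathrm{Exp}_{P\in\mathcal{D}}\, P|i\rangle\langle j|P^\dagger = \delta_{ij}|i\rangle\langle i| \tag{A1}$$

and

$$\mathrm{Exp}_{P\in\mathcal{D}}\, P|i\rangle\langle i|P^\dagger \otimes P|k\rangle\langle k|P^\dagger = |i\rangle\langle i| \otimes |k\rangle\langle k|, \tag{A2}$$

$$\mathrm{Exp}_{P\in\mathcal{D}}\, P|i\rangle\langle k|P^\dagger \otimes P|k\rangle\langle i|P^\dagger = |i\rangle\langle k| \otimes |k\rangle\langle i|. \tag{A3}$$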

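Other expectations of $|i\rangle\langle j| \otimes |k\rangle\langle l|$ are 0. The projections of $\rho$ and $\rho\otimes\rho$ onto the trivial eigenspaces of $\pi_1$ and $\pi_2$ are
therefore given by

$$\mathrm{Exp}_{P\in\mathcal{D}}\, P\rho P^\dagger = \sum_i \rho_{ii}|i\rangle\langle i|, \tag{A4}$$

$$\mathrm{Exp}_{P\in\mathcal{D}}\, P\rho P^\dagger \otimes P\rho P^\dagger = \sum_{i,j}\rho_{ii}\rho_{jj}\,|i\rangle\langle i| \otimes |j\rangle\langle j| \tag{A5}$$

$$\qquad\qquad + \sum_{i\neq j}\rho_{ij}\rho_{ji}\,|i\rangle\langle j| \otimes |j\rangle\langle i|. \tag{A6}$$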

Unfortunately, it is impossible to completely eliminate the contributions of the off-diagonal elements of $\rho$ to the
variance by this method. As will be seen, to reduce the effect of these contributions it is necessary to ensure that
$\rho$ is approximately diagonal by an initial unitary operation, or to design the algorithm so that $\sigma$ is approximately
diagonal (as will be the case if the output of the algorithm is deterministic when given one of the computational basis
states for input).

The calculations for the other groups to be presented below assume that $\rho$ has already been randomized by a diagonal
group. As a result, only the subspaces spanned by $|i\rangle\langle i|$ (for $\pi_1$) and by $|i\rangle\langle i| \otimes |j\rangle\langle j|$ and $|i\rangle\langle j| \otimes |j\rangle\langle i|$ (for $\pi_2$) will
be considered in our analysis.
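
Which basis tensors survive the diagonal averaging can be enumerated directly. The sketch below is a simplified stand-in rather than the group of Section V A: it assumes that each basis state independently receives a uniformly random fourth root of unity as its phase, and the function name `averaged_phase` is hypothetical.

```python
from itertools import product

PHASES = (1 + 0j, 1j, -1 + 0j, -1j)   # fourth roots of unity

def averaged_phase(i, j, k, l, dim):
    """Average, over all diagonal P with entries in PHASES, of the phase that
    P(.)P^dagger (x) P(.)P^dagger attaches to the basis tensor |i><j| (x) |k><l|."""
    total = 0j
    for p in product(PHASES, repeat=dim):
        total += p[i] * p[j].conjugate() * p[k] * p[l].conjugate()
    return total / len(PHASES) ** dim

dim = 3
survivors = [
    (i, j, k, l)
    for i, j, k, l in product(range(dim), repeat=4)
    if abs(averaged_phase(i, j, k, l, dim)) > 1e-12
]
print(len(survivors))
```

In this toy model the surviving index patterns are exactly those with i = j, k = l or with i = l, j = k. Using only signs ±1 instead of fourth roots of unity would also let tensors of the form $|i\rangle\langle j| \otimes |i\rangle\langle j|$ survive.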

2. Two-transitive Permutation Groups 

Let T be a two-transitive permutation group which fixes $|0\rangle$. It is straightforward to check that for $i \neq j$

$$\mathrm{Exp}_{P\in T}\, P|i\rangle\langle i|P^\dagger = \frac{1}{N-1}\sum_{i'}|i'\rangle\langle i'|, \tag{A7}$$

$$\mathrm{Exp}_{P\in T}\, P|i\rangle\langle j|P^\dagger = \frac{1}{(N-1)(N-2)}\sum_{i'\neq j'}|i'\rangle\langle j'|, \tag{A8}$$

where the indices in the sums range from 1 to N - 1. This convention for indices and labels will be in place for the
remainder of the appendix unless otherwise indicated. The relevant part of the trivial eigenspace of $\pi_1$ is spanned by
$I = \sum_{i'\geq 1}|i'\rangle\langle i'|$ and an operator with no diagonal entries. For $i \neq j$,
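$$\mathrm{Exp}_{P\in T}\, P|i\rangle\langle i|P^\dagger \otimes P|i\rangle\langle i|P^\dagger = \frac{1}{N-1}\sum_{i'}|i'\rangle\langle i'| \otimes |i'\rangle\langle i'|, \tag{A9}$$

$$\mathrm{Exp}_{P\in T}\, P|i\rangle\langle i|P^\dagger \otimes P|j\rangle\langle j|P^\dagger = \frac{1}{(N-1)(N-2)}\sum_{i'\neq j'}|i'\rangle\langle i'| \otimes |j'\rangle\langle j'|, \tag{A10}$$

$$\mathrm{Exp}_{P\in T}\, P|i\rangle\langle j|P^\dagger \otimes P|j\rangle\langle i|P^\dagger = \frac{1}{(N-1)(N-2)}\sum_{i'\neq j'}|i'\rangle\langle j'| \otimes |j'\rangle\langle i'|, \tag{A11}$$

$$\mathrm{Exp}_{P\in T}\, P|0\rangle\langle j|P^\dagger \otimes P|j\rangle\langle 0|P^\dagger = \frac{1}{N-1}\sum_{j'}|0\rangle\langle j'| \otimes |j'\rangle\langle 0|, \tag{A12}$$

$$\mathrm{Exp}_{P\in T}\, P|j\rangle\langle 0|P^\dagger \otimes P|0\rangle\langle j|P^\dagger = \frac{1}{N-1}\sum_{j'}|j'\rangle\langle 0| \otimes |0\rangle\langle j'|, \tag{A13}$$

and expressions involving other combinations of indices which will be of no further concern. The relevant part of the
trivial eigenspace of $\pi_2$ is therefore spanned (non-orthogonally) by

$$D = \sum_{i'}|i'\rangle\langle i'| \otimes |i'\rangle\langle i'|, \tag{A14}$$

$$E = \sum_{i',j'}|i'\rangle\langle i'| \otimes |j'\rangle\langle j'|, \tag{A15}$$

$$J = \sum_{i',j'}|i'\rangle\langle j'| \otimes |j'\rangle\langle i'|, \tag{A16}$$

$$Z_1 = \sum_{i'}|0\rangle\langle i'| \otimes |i'\rangle\langle 0|, \tag{A17}$$

$$Z_2 = \sum_{i'}|i'\rangle\langle 0| \otimes |0\rangle\langle i'|. \tag{A18}$$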



Define

Po = ^Po»KX l +Pío\0)(í\, 

i>l 

Pa = P~ Po- 
If P is a random product of operators in T and in a diagonal group $\mathcal{D}$, then

Exp PiePjP2er P 2 P 1 pP 1 t P 2 t = ^JL_ Ç = 0, 

i 

^Vp^d,p^tP^PipPM ® P2P1PPM = j^j E <^ 

+ (A,-l)(A/-2)g^( J -^ 
~T7 PoíPío{Zi + ^2) 



N- 



= — tr(^)r, 
1 



+ 
+ 



(N — í)(N — 2) 
1 

(AT- l)(AT-2) 



(tr(p ) 2 -tr(^))(P-^) 
(tr(p?)-tr(^))(J_£) 



1 

2(N - 1) 



tr(^)(Z 1 + Z 2 ). 



(A19) 



(A20) 



(A21) 
The variance v is obtained by taking the inner product of this expression with $\sigma \otimes \sigma$. Define

àd = ^2a u \i)(i\, ( A22 ) 

i 

<r = ^^1^(01 +<7 i0 |0)(i|, (A23) 

i 

àò = a-à o -a oo \0){0\. (A24) 

We will make use of the following (in)equalities: 

trp = trpg 
= trp rf 

= 0, (A25) 
tr(p 2 ) = tr(p§) + tr(p 2 ) + p 2 00 + (N - l)p 2 

< 1, (A26) 
tr(èrg) = -croo, (A27) 
tr(cr 2 )=tr(^)+tr(^) + ^ 

= N, (A28) 
tr(CT 2 ) + 2cr 2 = 2, (A29) 
tr(<r 2 ) < tr(a 2 ) 

< N — 1, (A30) 

where we used the properties of the trace inner product and the fact that $\sigma$ is unitary. The variance can now be
estimated by

' tr(p 2 )tr(«r 2 ) 



N- 1 

1 



(iV- l)(JV-2) 



(tr(p d ) 2 -tr(p 2 ))(tr(à d ) 2 -tr(a 2 )) 



' (tr(p|)-tr(p 2 ))(tr(^)-tr(^))) 



(N- í)(N-2) 

+ rtrfpn)tr(àn) 

2(N-1) y ^ ' K 

<tr(p 2 ) + ^tr(p 2 ). (A31) 

Both of the terms in this expression can be large compared to f 2 . The presence of the second term shows the 
importance of ensuring that $\rho$ is initially in a nearly diagonal form, and implies a limit on the effectiveness of the
diagonal group. However, if $\sigma$ is diagonal in the computational basis, the second term does not arise.

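The signal to noise ratio for the thermal distribution can now be obtained as follows. With the definitions from
Section IV,

$$\begin{aligned}
\mathrm{tr}(\hat\rho^2) &= \sum_{i=1}^{N-1}(\rho_{ii}-\bar p)^2 \\
&\le \sum_{i=0}^{N-1}\left(\rho_{ii}-\frac{1}{N}\right)^2 \\
&= \sum_{i=0}^{N-1}\rho_{ii}^2 - \frac{1}{N} \\
&= \prod_{i=1}^{n}\left(\left(\frac{1+\delta_i}{2}\right)^2 + \left(\frac{1-\delta_i}{2}\right)^2\right) - \frac{1}{N} \\
&= \frac{1}{N}\left(\prod_{i=1}^{n}(1+\delta_i^2) - 1\right) \\
&\le \frac{1}{N}\left(e^{\sum_i \delta_i^2} - 1\right),
\end{aligned} \tag{A32}$$

$$\mathrm{tr}(\hat\rho^2) \approx \frac{1}{N}\sum_i \delta_i^2. \tag{A33}$$

The last expression is a good approximation as long as $\sum_i \delta_i^2 \ll 1$. The probability of the ground state is given by

$$\rho_{00} = \prod_{i=1}^{n}\frac{1+\delta_i}{2} \approx \frac{1}{N}\left(1 + \sum_i \delta_i\right), \tag{A34}$$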

which is a good approximation as long as $\sum_i \delta_i \ll 1$. Thus the signal to noise ratio for randomization using a
two-transitive group is bounded by

$$\mathrm{SNR} \ge \frac{\sum_i \delta_i\,|\sigma_{00}|}{\sqrt{2^{2n}s + 2^n(1 + 1/(2^n-2))\sum_i \delta_i^2}}. \tag{A35}$$

To understand the behavior of this expression, consider the case where $\delta_i = \delta$ is independent of the qubit. We express
s in terms of the signal to noise ratio $\mathrm{SNR}_1$ for a single qubit, $\mathrm{SNR}_1 = \delta/\sqrt{s}$. For a typical NMR experiment with
protons, $\mathrm{SNR}_1 \lesssim 10^3$. With these definitions,

$$\begin{aligned}
\mathrm{SNR} &\ge \frac{n|\sigma_{00}|\delta}{\sqrt{2^{2n}\delta^2/\mathrm{SNR}_1^2 + 2^n(1 + 1/(2^n-2))\,n\delta^2}} \\
&\ge \frac{n}{2^n}\,\frac{\mathrm{SNR}_1|\sigma_{00}|}{\sqrt{1 + n\,\mathrm{SNR}_1^2/(2^n-2)}}.
\end{aligned} \tag{A36}$$

For small n, SNR is dominated by the contribution to the variance from the randomization process, while for large
n, it is dominated by the reduction in excess probability in the ground state.
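
The thermal-state estimates of this section can be reproduced numerically. The sketch below assumes that each qubit independently has populations $(1 \pm \delta_i)/2$, as in the product form used above, and that the bound (A36) has the prefactor $n/2^n$; the polarizations are illustrative values rather than measured ones, and the function names are hypothetical.

```python
import math
from itertools import product

def thermal_populations(deltas):
    """Diagonal of the product state with single-qubit populations (1 +/- delta_i)/2."""
    pops = []
    for bits in product((0, 1), repeat=len(deltas)):
        p = 1.0
        for bit, d in zip(bits, deltas):
            p *= (1 + d) / 2 if bit == 0 else (1 - d) / 2
        pops.append(p)
    return pops            # pops[0] is the ground-state probability rho_00

def snr_bound_a36(n, snr1, sigma00=1.0):
    """Lower bound of Eq. (A36) for equal polarizations; needs n >= 2."""
    return (n / 2 ** n) * snr1 * abs(sigma00) / math.sqrt(1 + n * snr1 ** 2 / (2 ** n - 2))

deltas = [0.01, 0.02, 0.015]                 # illustrative polarizations
pops = thermal_populations(deltas)
N = len(pops)
sum_sq = sum(p * p for p in pops) - 1 / N                          # third line of (A32)
closed_form = (math.prod(1 + d * d for d in deltas) - 1) / N       # fifth line of (A32)
upper_bound = (math.exp(sum(d * d for d in deltas)) - 1) / N       # last line of (A32)
print(sum_sq, closed_form, upper_bound)
print(pops[0], (1 + sum(deltas)) / N)                              # both sides of (A34)
for n in (2, 4, 8, 16):
    print(n, snr_bound_a36(n, snr1=1e3))
```

With $\mathrm{SNR}_1 = 10^3$ the bound stays of order one for small n, because the randomization variance dominates the single-qubit noise there, and it falls off roughly as $n/2^n$ once $2^n$ exceeds $n\,\mathrm{SNR}_1^2$.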



3. Cyclic Permutation Groups 

Consider using a cyclic group $S_1$ of permutations which leave $|0\rangle$ fixed. This was done for exhaustive averaging,
but can also be applied to randomization. As we will see, the main problem is that the variance of the measurements
cannot be guaranteed to be sufficiently small. Let $\pi$ be a generator of the group of order N - 1.

The trivial eigenspaces of $S_1$ can be computed as in the previous section. The relevant subspaces are spanned by
I for $\pi_1$ and

$$D_k = \sum_{i\geq 1}|i\rangle\langle i| \otimes |\pi^k(i)\rangle\langle \pi^k(i)|, \tag{A37}$$

$$J_k = \sum_{i\geq 1}|i\rangle\langle \pi^k(i)| \otimes |\pi^k(i)\rangle\langle i|, \tag{A38}$$

$Z_1$, $Z_2$ and a few others of no further concern for $\pi_2$.

Let P be a random product of an element of a diagonal group $\mathcal{D}$ and the cyclic group $S_1$.

Exp PlGl , : p 2e5l P 2 Pi / 5P 1 t P 2 t = 0, (A39) 

N-2 



Exp PieX , : p 2e5l P 2 PipP 1 t P 2 t (»P2PipP 1 t P 2 t = E ^Üft-^WW^ 

fc=0 i 
N-2 



k=l 
1 



2(N - 1) 



tv(p 2 )(Z 1 +Z 2 ). (A40) 
To compute v we take the trace after multiplying by $\sigma \otimes \sigma$.



fc=0 i i 

JV-2 



+ 



_ _ (0 

fe=l 
1 



ï tr(^)tr(a 2 ). (A41) 



The sum involves off-diagonal expressions and products of correlations of the diagonals of $\rho$ and $\sigma$. Although v can
be much too high in the worst case, in practice one can expect it to be close to what was obtained for a two-transitive
group. However, since the known algorithms for the cyclic groups are no more efficient than those for the linear group,
there is presently little to be gained by using cyclic groups.

4. The Unitary Group 

A spanning set of eigenvectors of the representation $\pi_2$ of the unitary group U acting on $|1\rangle, \ldots, |N-1\rangle$ consists
of E, J, $Z_1$, $Z_2$ and $|0\rangle\langle 0| \otimes |0\rangle\langle 0|$. As a result one obtains



Exp Pe[/ PpPt g, Pp p\ = __l__(( tr/ 5 d )2 + t r(pÍ))(È + J) 



' ((trp d ) 2 -tr(pl))(É-J) 



Thus 



2(N-1)(N - 2) 
+ ^ T ytr(^)(Z 1 + Z 2 ) 

NW^f^ )J - N(N-l)(N-2f ^ É + W^ïf^ + ^ 



(A42) 



< J^^iP 2 )- (A43) 

By using the unitary group, it is possible to eliminate the second term that occurs in the expression for v for the
two-transitive permutation groups. Although it is impossible to efficiently implement random elements of the unitary
group, there are effective methods for accomplishing the same by using the normalizer group.

5. The Normalizer Group 

The normalizer group is as effective at randomizing $|0\rangle, \ldots, |N-1\rangle$ as the full unitary group, at least in terms of
expectations and variance. It is straightforward to determine the trivial eigenspaces of $\pi_1$ and $\pi_2$ in the Pauli operator
basis. For $i \neq j$ and $j \neq 0$,

$$\mathrm{Exp}_{P\in\mathcal{N}}\, P\sigma_iP^\dagger = \delta_{i,0}\,\sigma_0, \tag{A44}$$

$$\mathrm{Exp}_{P\in\mathcal{N}}\, P\sigma_iP^\dagger \otimes P\sigma_jP^\dagger = 0, \tag{A45}$$

$$\mathrm{Exp}_{P\in\mathcal{N}}\, P\sigma_jP^\dagger \otimes P\sigma_jP^\dagger = \frac{1}{4^m-1}\sum_{i'\neq 0}\sigma_{i'} \otimes \sigma_{i'}, \tag{A46}$$

(A47)

where m is the number of qubits. Using these identities, it can be verified that the trivial eigenspace of $\pi_1$ is spanned
by the identity, and that of $\pi_2$ by $E = I \otimes I$ and $J = \sum_{i',j'\geq 0}|i'\rangle\langle j'| \otimes |j'\rangle\langle i'|$.

To exploit the normalizer group without removing the polarization in $|0\rangle$ requires conditioning it on one of the
qubits.

6. Conditional Normalizer Group 

In this section we analyze the behavior of the algorithm based on alternate randomizations using T and the
conditional normalizer group $\mathcal{N}_1$.

Let $R_{k-1}$ be the expectation of $P\rho P^\dagger \otimes P\rho P^\dagger$ after the k'th step of the conditional normalizer group algorithm.
Using Eq. (A21),



Ro = ~, ^7 r (iVtríp?) - tr(pi)) D + - ^ r(trfpl) - tr(óS)) J 

U (N-1)(N-2) K yFdJ yHa " (N -1)(N -2) V KH ° J yFdJJ 



(N — l)(N-2) 

+ 2(jv 1 _ 1) tr(^)(^i + ^ 2 ). (A48) 

Define $\alpha_k$, $\beta_k$, $\gamma_k$ and $\delta$ by $R_k = \alpha_k D + \beta_k J + \gamma_k E + \delta(Z_1 + Z_2)$, where we have used the fact that $Z_1 + Z_2$ is not
affected by randomization with T and $\mathcal{N}_1$.

Because $\mathcal{N}_1$ distinguishes the state of the first qubit, we need to subdivide the tensors in the expression for $R_0$.
Write $D = D_0 + D_1$, $J = J_{00} + J_{01} + J_{10} + J_{11}$ and $E = E_{00} + E_{01} + E_{10} + E_{11}$. For example, $D_0 = \sum_{i=1}^{N/2-1}|i\rangle\langle i| \otimes |i\rangle\langle i|$,
$J_{01} = \sum_{i=1}^{N/2-1}\sum_{j=N/2}^{N-1}|i\rangle\langle j| \otimes |j\rangle\langle i|$ and $E_{10} = \sum_{i=N/2}^{N-1}\sum_{j=1}^{N/2-1}|i\rangle\langle i| \otimes |j\rangle\langle j|$, where we are using the convention that
the indices $i \geq 2^{n-1} = N/2$ are those referring to states with the first qubit in state $|1\rangle$. Randomizing over $\mathcal{N}_1$
preserves all but one of these expressions:

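$$\mathrm{Exp}_{P\in\mathcal{N}_1}\,\pi_2(P)(D_1) = \frac{2}{N+2}J_{11} + \frac{2}{N+2}E_{11}. \tag{A49}$$

(A50)

Hence

$$\mathrm{Exp}_{P\in\mathcal{N}_1}\,\pi_2(P)(R_k) = \alpha_k D_0 + \beta_k J + \frac{2}{N+2}\alpha_k J_{11} + \gamma_k E + \frac{2}{N+2}\alpha_k E_{11} + \delta(Z_1 + Z_2). \tag{A51}$$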



Randomizing over T gives

$$\mathrm{Exp}_{P\in T}\,\pi_2(P)(D_0) = \frac{N-2}{2(N-1)}D, \tag{A52}$$

$$\begin{aligned}\mathrm{Exp}_{P\in T}\,\pi_2(P)(E_{11}) &= \frac{N}{2(N-1)}D + \frac{N}{4(N-1)}(E - D) \\ &= \frac{N}{4(N-1)}D + \frac{N}{4(N-1)}E,\end{aligned} \tag{A53}$$

$$\begin{aligned}\mathrm{Exp}_{P\in T}\,\pi_2(P)(J_{11}) &= \frac{N}{2(N-1)}D + \frac{N}{4(N-1)}(J - D) \\ &= \frac{N}{4(N-1)}D + \frac{N}{4(N-1)}J,\end{aligned} \tag{A54}$$
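
The coefficients produced by the T-averaging can be confirmed by exhaustive enumeration for a small N. The sketch below uses the full symmetric group on the labels 1, ..., N - 1 as a stand-in for T; this is an assumption made for convenience, justified only by the fact that the average of these tensors depends solely on two-transitivity. Diagonal tensors $|i\rangle\langle i| \otimes |j\rangle\langle j|$ are stored as index pairs (i, j), and all names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def average_over_permutations(pairs, N):
    """Average sum_{(i,j) in pairs} |i><i| (x) |j><j| over all permutations of 1..N-1.

    Returns a dict mapping (i, j) to the coefficient of |i><i| (x) |j><j|."""
    labels = range(1, N)
    perms = list(permutations(labels))
    weight = Fraction(1, len(perms))
    result = {}
    for perm in perms:
        image = dict(zip(labels, perm))
        for i, j in pairs:
            key = (image[i], image[j])
            result[key] = result.get(key, 0) + weight
    return result

def combination(d_coeff, e_coeff, N):
    """Non-zero coefficients of d_coeff * D + e_coeff * E in the same representation."""
    coeffs = {}
    for i in range(1, N):
        for j in range(1, N):
            value = (d_coeff if i == j else 0) + e_coeff
            if value != 0:
                coeffs[(i, j)] = value
    return coeffs

N = 8
upper = range(N // 2, N)                       # states with the first qubit in |1>
D0 = [(i, i) for i in range(1, N // 2)]
E11 = [(i, j) for i in upper for j in upper]

check_d0 = average_over_permutations(D0, N) == combination(Fraction(N - 2, 2 * (N - 1)), 0, N)
check_e11 = average_over_permutations(E11, N) == combination(
    Fraction(N, 4 * (N - 1)), Fraction(N, 4 * (N - 1)), N)
print(check_d0, check_e11)
```

The computation for $J_{11}$ has the same index bookkeeping with $|i\rangle\langle j| \otimes |j\rangle\langle i|$ in place of $|i\rangle\langle i| \otimes |j\rangle\langle j|$, so it leads to the same coefficients with J in place of E.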



so that

$$R_{k+1} = \mathrm{Exp}_{P_2\in T}\,\mathrm{Exp}_{P_1\in\mathcal{N}_1}\,\pi_2(P_2P_1)(R_k) \tag{A55}$$

$$= \left(\frac{1}{2} + \frac{N-2}{2(N-1)(N+2)}\right)\alpha_k D + \left(\beta_k + \frac{N}{2(N+2)(N-1)}\alpha_k\right)J + \left(\gamma_k + \frac{N}{2(N+2)(N-1)}\alpha_k\right)E + \delta(Z_1 + Z_2). \tag{A56}$$

The variance $v_{k+1}$ after the k'th step is given by
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v k+ i = tr(R k+1 a <g> a) (A57) 
= a k+1 tr(à 2 d ) + /3 fe+ itr(a?) + j k+1 (tra d f + 5tr(a^). (A58) 

We can estimate the coefficients as follows:



N 
1 



a tr(<7 2 ) < ^_^tr(^), (A59) 



PM°l) < j^MpI) - ( A6 °) 
i 



7o(tró d ) 2 = - _ " _ 2 ^ tr(pg)(trg d ) 2 

< 0, (A61) 
5tr(«T 2 ) < ^ytrp 2 , (A62) 

«H-itrfa) = - \l + {N _ 1){N + 2) ) 

< ie^afetr^) 

<^e^^^tr(^). (A63) 
Define $\lambda = e^{1/(N+2)}/2$. The coefficients $\beta_k$ and $\gamma_k$ are monotonically increasing. The limiting values are

/3ootr(«r 2 ) = (/3 + -^a )tr(óg) 

< ^ tr (/5o)' (A64) 

7ootr(<r d ) 2 = (-^ao + 7o)tr((T (i ) 2 

< 0. (A65) 

Thus 

v k+1 < A fe ^tr(p 2 ) + ^tr(p 2 ). (A66) 

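By choosing k large enough, the variance can be reduced to near that obtainable by randomizing over the whole
unitary group. In fact, if k is chosen so that $\lambda^k < 1/(2(N+2))$, then the maximum contribution to the variance is
$v \le \frac{2}{N-1}\mathrm{tr}(\hat\rho^2)$. Consider the case where $\rho$ is diagonal with $\rho_{00}$ maximal, $c = \sqrt{s}/(\rho_{00} - \bar p)$ and the output of the
algorithm is deterministic (i.e. $\sigma_{00}^2 = 1$). Then $\frac{2}{N-1}\mathrm{tr}(\hat\rho^2) \le 2\bar p(\rho_{00} - \bar p)$ and

$$\begin{aligned}\mathrm{SNR} &\ge \frac{\rho_{00} - \bar p}{\sqrt{s + 2\bar p(\rho_{00} - \bar p)}} \\ &\ge \frac{\sqrt{\rho_{00} - \bar p}}{\sqrt{c^2\rho_{00} + 2\bar p}}.\end{aligned} \tag{A67}$$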

Consequently, if $\bar p \ll c\rho_{00}$, the signal to noise ratio is dominated by the term due to measurement noise. If
$\bar p \gg c\rho_{00}$, then the signal to noise ratio is determined by the contribution from the randomization method. As long
as $\bar p$ is sufficiently smaller than $\rho_{00}$ and c < 1 the signal to noise ratio is bounded below by a constant, which ensures
that a small number of experiments are required to determine whether $\sigma_{00} = 1$ or $\sigma_{00} = -1$. However, in the case
where $\bar p \approx \rho_{00}$, the signal to noise ratio can be very small, for example if $\rho_{ii} = 0$ or $\rho_{ii} = \rho_{00}$ for all i. The situation
where $\bar p \lesssim \rho_{00}$ arises in the high temperature limit of NMR quantum computation. In this case the signal to
noise ratio can be estimated as

$$\mathrm{SNR} \ge \frac{n}{2^n}\,\frac{\mathrm{SNR}_1|\sigma_{00}|}{\sqrt{1 + 2n\,\mathrm{SNR}_1^2/(2^n(2^n-1))}}. \tag{A68}$$
7. Randomized Flip&Swap



For fully randomized flip&swap, each experimental determination of the output of the computation consists of two
experiments. First a sequence of k random operators implementing the conditional normalizer method is chosen. For
the present purposes we choose k so that $\lambda^k < 1/(2(N+2))$. Next two experiments are performed. In the first the
chosen sequence of operators is applied before measuring $\sigma$. In the second, the flip&swap operation is used before
applying the same sequence of random operators and measuring $\sigma$. The measurements are added to obtain the desired
answer.

This algorithm behaves exactly like a single randomized experiment with input $\rho_s$ (Eq.(15)) and measurement
variance s/2. The variance of the randomization is therefore given by

$$v \le \frac{2}{N-1}\mathrm{tr}(\hat\rho_s^2) \tag{A69}$$

$$\le \frac{2n^2\delta^2}{(N-1)N^2}. \tag{A70}$$

Substituting in the expression for the signal to noise ratio gives

$$\mathrm{SNR} \ge \frac{n}{2^n}\,\frac{\mathrm{SNR}_1|\sigma_{00}|}{\sqrt{1/2 + 2n^2\,\mathrm{SNR}_1^2/(2^{2n}(2^n-1))}}, \tag{A71}$$

where we have taken into account the fact that two experiments contributed to the signal.

Instead of using the conditional normalizer group, one can use any set of permutation operators $\{P_i\}_{i=1}^{N-1}$ with
$P_i|0\rangle = |0\rangle$ and $P_i|N-1\rangle = |i\rangle$. For example, a cyclic linear group can be relabeled to have this property. Because
of the symmetries of $\rho_s$, this is as effective as using a two-transitive group. Since $\mathrm{tr}(\hat\rho_s^2) \le n^2\delta^2/N^2$,

$$v \le \frac{(2^n-1)\,n^2\delta^2}{2^{2n}(2^n-2)} \tag{A72}$$

and

$$\mathrm{SNR} \ge \frac{n}{2^n}\,\frac{\mathrm{SNR}_1|\sigma_{00}|}{\sqrt{1/2 + n^2\,\mathrm{SNR}_1^2/(2^n(2^n-2))}}. \tag{A73}$$



APPENDIX B: IMPLEMENTATIONS OF TEMPORAL AVERAGING ALGORITHMS 



1. Flip&Swap 

The implementation of labeled flip&swap for three qubits and an ancilla is shown in Figure [| The flip&swap is 
the first group of gates, consisting of a not applied to each qubit, followed by controlled-nots from the first to each 
of the other qubits, an n — 1-controlled-not conditioned on the last n — 1 qubits being |0), and finally a reversal of 
the first set of controlled-nots. Efficient quantum networks for the n — 1-controlled not (generalized Toffoli gates) are 
given in Note that for diagonal initial states, phase variants are equivalent, so we can use an SU variant of the 
Toffoli gate to avoid ancillas while still having an 0(n) implementation. Also, the computation can be arranged so 
that it is 0(n) even if controlled operations can only be performed between adjacent qubits in a linear ordering. 

An efficient method for implementing randomized flip&swap is to choose for each $|b\rangle \neq |0\rangle$ an "easy" linear operator
L modulo 2 such that $L\mathbf{1} = b$, where $\mathbf{1}$ denotes the all-ones vector. If b has w ones, such an operator with at most n - w off-diagonal ones exists. The
corresponding unitary operator in the group of linear permutations can be implemented with n - w controlled-nots.
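
One way to build such an "easy" operator can be sketched as follows. The construction is an assumption chosen for illustration, not necessarily the one the authors had in mind: start from the identity and, for every zero bit of b, place a single off-diagonal one in the column of a fixed one bit of b. The function names are hypothetical.

```python
def easy_linear_operator(b):
    """Return an invertible 0-1 matrix L (mod 2) with L * (1,...,1) = b, for b != 0."""
    n = len(b)
    if not any(b):
        raise ValueError("b must be non-zero")
    pivot = list(b).index(1)                # a position where b has a one
    L = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
    for r in range(n):
        if b[r] == 0:
            L[r][pivot] = 1                 # one controlled-not: control pivot, target r
    return L

def apply_mod2(L, v):
    """Matrix-vector product over GF(2)."""
    return [sum(x * y for x, y in zip(row, v)) % 2 for row in L]

b = [0, 1, 1, 0]
L = easy_linear_operator(b)
ones = [1] * len(b)
print(apply_mod2(L, ones))                  # image of the all-ones bit string
print(sum(L[r][c] for r in range(len(b)) for c in range(len(b)) if r != c))
```

Each off-diagonal one corresponds to a controlled-not with the pivot qubit as control, so the operator uses n - w controlled-nots, and it is its own inverse modulo 2.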



2. The Normalizer Group 

Every element of the normalizer group $\mathcal{N}$ operating on n qubits can be implemented by $O(n^2)$ controlled-nots and
$\pi/2$ or $\pi$ rotations of single qubits. For the purposes of randomly choosing one of the members of $\mathcal{N}$, the natural
representation of $U \in \mathcal{N}$ is
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0\Lb- 



A uniform random element can be obtained by choosing x and L uniformly subject to $L^T M L = M$ (see Section V C).
The vector x is obtained by setting each of the 2n entries of x independently and uniformly to 0 or 1. To obtain
uniformly distributed valid L's one can construct L column by column. Write M in the block form of Eq. (B2),
where the entries are n by n matrices and the partitioning is based on writing the index b of $\sigma_b$ in the form $b = b_0b_1$,
with $b_0$ and $b_1$ containing the indices coming from the first and second members of each qubit's pair, respectively.

If $L_{\le k}$ is the 2n by k matrix consisting of the first k columns of L, then $L_{\le k}^T M L_{\le k} = M_{\le k,\le k}$, where $M_{\le k,\le k}$ is
the k by k submatrix of M in the upper left corner. The columns of $L_{\le k}$ are linearly independent (modulo
2). Suppose $L_{\le k}$ has been constructed and we wish to add another column to obtain $L_{\le k+1}$. The new column $L_{k+1}$
has to satisfy Eqs. (B3) and (B4).



The first equality is satisfied for any $L_{k+1}$, so we wish to choose $L_{k+1}$ randomly, not in the span of $L_{\le k}$ and subject
to the second equality. The dimension of the affine space of solutions to this equality is 2n - k, while the dimension
of the span of $L_{\le k}$ is k. We consider two cases. If k < n, then $M_{k+1,\le k} = 0$, and the span of $L_{\le k}$ is contained in
the space of solutions. Because 2n - k > k, suitable $L_{k+1}$ can be found. To pick $L_{k+1}$ uniformly one can use any
algorithm (e.g. one based on Gaussian elimination modulo 2) to obtain 2n - 2k vectors $S_1, \ldots, S_{2n-2k}$ independent
of the columns of $L_{\le k}$ which together with $L_{\le k}$ span the solution space. A random $L_{k+1}$ is obtained by choosing
a random non-zero linear combination of the $S_1, \ldots, S_{2n-2k}$ and adding it to a random linear combination of the
columns of $L_{\le k}$.

If $k \ge n$, then $M_{k+1,\le k}$ is non-zero. If y is in the span of $L_{\le k}$, then the (k - n + 1)'th entry of $y^T M L_{\le k}$ is zero.
Since that entry of $M_{k+1,\le k}$ is 1, the set of solutions to $y^T M L_{\le k} = M_{k+1,\le k}$ does not contain any element y in the
span of $L_{\le k}$. We can therefore pick a random element in this affine subspace of dimension 2n - k. An affine basis for
this subspace can again be obtained by a Gaussian elimination method.

The above construction shows that the number of valid L's is $\prod_{k=0}^{n-1}(2^{2n-k} - 2^k)\prod_{k=0}^{n-1}2^{n-k}$. In view of the
technique for constructing random invertible matrices over $Z_2$ given in [13], there are probably more efficient methods
for constructing random L's.
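
As a sanity check on this count, the sketch below evaluates the product and compares it with the standard order of the symplectic group over GF(2), $|Sp(2n,2)| = 2^{n^2}\prod_{i=1}^{n}(2^{2i}-1)$. The comparison is an addition made here for illustration and assumes that the valid L's are exactly the matrices preserving the form M; the function names are hypothetical.

```python
def count_valid_L(n):
    """Number of valid L's according to the column-by-column construction."""
    count = 1
    for k in range(n):                 # first n columns: solution space minus span of earlier columns
        count *= 2 ** (2 * n - k) - 2 ** k
    for k in range(n):                 # last n columns: a full affine subspace each time
        count *= 2 ** (n - k)
    return count

def symplectic_group_order(n):
    """Order of Sp(2n, 2): 2^(n^2) * prod_{i=1..n} (2^(2i) - 1)."""
    order = 2 ** (n * n)
    for i in range(1, n + 1):
        order *= 2 ** (2 * i) - 1
    return order

for n in range(1, 6):
    print(n, count_valid_L(n), symplectic_group_order(n))
```

The two expressions agree for every n, since $\prod_{k=0}^{n-1}2^k(2^{2(n-k)}-1)\cdot\prod_{k=0}^{n-1}2^{n-k} = 2^{n^2}\prod_{i=1}^{n}(2^{2i}-1)$.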

To obtain a quantum network which implements the unitary operator defined by (x, L) requires decomposing L into
elementary operations corresponding to controlled-nots and single qubit rotations. This can be done by adapting the
methods described in |ï^| . The basic idea is to multiply L on the left and right by the linear operators corresponding
to controlled-nots and rotations. Since controlled-nots correspond to elementary row/column operations in the n by n
subblocks, one can apply Gaussian elimination methods to convert the first (say) subblock to standard form. The $\pi/2$
rotations around the different axes permit elementary row/column operations between corresponding rows/columns of
different subblocks. This can be used to transform L to I. The representation of the resulting sequence of
controlled-nots and rotations is of the form (x', L). To correct the first component, one can apply $\sigma_{x - x'}$ to the qubits. The
total number of gates needed to implement an operator in $\mathcal{N}$ is $O(n^2)$ E3|.

Implementing $\mathcal{D}$. Being a subgroup of $\mathcal{N}$, it is clear that each operator in $\mathcal{D}$ has an efficient quantum network. The
random phase shifts of $\mathcal{D}$ are described by operators D(x, B) defined by $D(x,B)|k\rangle = i^{\langle x,k\rangle}(-1)^{\langle k,Bk\rangle}|k\rangle$. A random
such operator is obtained by choosing x randomly and uniformly from all n dimensional vectors over {0, 1, 2, 3} and
B uniformly from the set of strictly upper triangular n by n 0-1 matrices. Given such an x and B, the phase shifts
are implemented by first applying phase shifts by $i^{x_j}$ of $|1\rangle$ to the j'th qubit, and then performing a sequence of
controlled-sign flips. The sequence of controlled-sign flips can be read off the entries of B by the following procedure:
If $B_{ij} = 1$, apply a controlled sign-flip between bits i and j. The number of operations required to apply the random
phase shift is at most n(n - 1)/2.
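
The equivalence between the closed-form phase and the gate-by-gate procedure can be checked on all basis states of a small register. This is a minimal sketch, assuming the phase is $i^{\langle x,k\rangle}(-1)^{\langle k,Bk\rangle}$ with k a bit vector and B strictly upper triangular; the function names and the sample x and B are hypothetical.

```python
from itertools import product

def phase_direct(x, B, k):
    """Phase i^<x,k> (-1)^<k,Bk> attached to the basis state |k>."""
    n = len(k)
    linear = sum(x[j] * k[j] for j in range(n)) % 4
    quadratic = sum(k[i] * B[i][j] * k[j] for i in range(n) for j in range(n)) % 2
    return (1j ** linear) * (-1) ** quadratic

def phase_by_gates(x, B, k):
    """Same phase, accumulated one gate at a time."""
    n = len(k)
    phase = 1 + 0j
    for j in range(n):                       # single-qubit phase shifts i^{x_j} on |1>
        if k[j]:
            phase *= 1j ** x[j]
    for i in range(n):                       # controlled-sign flips read off B
        for j in range(i + 1, n):
            if B[i][j] and k[i] and k[j]:
                phase *= -1
    return phase

x = [1, 2, 3]                                # sample vector over {0, 1, 2, 3}
B = [[0, 1, 0],
     [0, 0, 1],
     [0, 0, 0]]                              # sample strictly upper triangular matrix
agree = all(
    abs(phase_direct(x, B, k) - phase_by_gates(x, B, k)) < 1e-12
    for k in product((0, 1), repeat=3)
)
print(agree)
```

Because B is strictly upper triangular, each one in B contributes a sign flip only when both of its qubits are in state $|1\rangle$, which is what a controlled sign-flip does.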

Implementing T. A unitary operator U in T is defined by $U|b\rangle = |Lb\rangle$ for an invertible (modulo 2) n by n matrix
L. Any such unitary operator can be implemented using only controlled-nots. Since a controlled-not corresponds to
an elementary row/column operation, a decomposition of L into such operations yields the desired quantum network.
The decomposition can be accomplished by the usual Gaussian elimination methods. A random invertible L can
be generated column by column using a simpler version of the method described for the normalizer group. A more
efficient algorithm which can be used to construct the decomposition into elementary operations at the same time is
described in [13].
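
The Gaussian elimination step can be sketched as follows. This is one possible decomposition written for illustration, not the more efficient algorithm of the cited reference: it reduces L to the identity using row additions only, each of which is a controlled-not, and returns the gates in the order in which they act. The function names are hypothetical.

```python
from itertools import product

def cnot_decomposition(L):
    """Return controlled-nots (control, target), in circuit order, mapping b to L b (mod 2)."""
    n = len(L)
    M = [row[:] for row in L]
    ops = []  # row additions "row[target] ^= row[source]" that reduce M to the identity
    for col in range(n):
        if M[col][col] == 0:
            # Bring a one onto the diagonal by adding a lower row (no swap needed).
            pivot = next((r for r in range(col + 1, n) if M[r][col]), None)
            if pivot is None:
                raise ValueError("matrix is singular modulo 2")
            M[col] = [a ^ b for a, b in zip(M[col], M[pivot])]
            ops.append((pivot, col))
        for r in range(n):
            if r != col and M[r][col]:
                M[r] = [a ^ b for a, b in zip(M[r], M[col])]
                ops.append((col, r))
    # E_m ... E_1 L = I and every E is its own inverse, so L = E_1 ... E_m;
    # acting on a bit vector, E_m is applied first.
    return ops[::-1]

def apply_cnots(cnots, b):
    """Apply the controlled-nots to a bit string, one after the other."""
    bits = list(b)
    for control, target in cnots:
        bits[target] ^= bits[control]
    return bits

L = [[1, 1, 0],
     [0, 1, 1],
     [1, 1, 1]]                               # sample invertible matrix modulo 2
gates = cnot_decomposition(L)
print(gates)
print(all(
    apply_cnots(gates, b) == [sum(x * y for x, y in zip(row, b)) % 2 for row in L]
    for b in product((0, 1), repeat=3)
))
```

The number of gates produced this way is at most of order $n^2$, in line with the elimination-based count.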



$$M = \begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix} \tag{B2}$$

$$L_{k+1}^T M L_{k+1} = 0, \tag{B3}$$

$$L_{k+1}^T M L_{\le k} = M_{k+1,\le k}. \tag{B4}$$
3. Entanglement



The operations $P_1$ and $P_2$ required to implement the method for effective pure states by entanglement are
implemented as follows. A phase variant equivalent to $P_1$ for diagonal initial states is obtained by applying a $\pi/2$ rotation
around the y axis to each of the second group of n qubits. The operation $P_2$ is decomposed into the product of
$P_{2,i} : |a\rangle|b\rangle \to |a x^{b_i 2^i}\rangle|b\rangle$ for i = 0, ..., n - 1. Multiplication by $x^{b_i 2^i}$ in GF($2^n$) is a linear map modulo two and defines
an element of T which can be implemented with $O(n^2)$ controlled-nots. Each $P_{2,i}$ can therefore be implemented with
$O(n^2)$ Toffoli gates, and $P_2P_1$ takes $O(n^3)$ operations.